ML Conundrums - Prof. Sunita Sarawagi, IIT Bombay
Prof. Sunita Sarawagi is the Institute Chair Professor in Computer Science and Engineering at IIT-Bombay. She received her B.Tech. in Computer Science from the Indian Institute of Technology, Kharagpur in May 1991. She received her M.S. and Ph.D. in Computer Science from the University of California at Berkeley where she studied under Michael Stonebraker. Following her Ph.D., Sarawagi did stints at IBM Almaden Research Center as a research scholar and at Carnegie Mellon as visiting faculty, and joined IIT-Bombay in 1999. Between July 2014 and July 2016, Prof. Sarawagi was Visiting Scientist at Google Inc. in Mountain View where she worked on deep learning models for personalizing and diversifying YouTube and Play recommendations, improving Duo’s conversation assistance engine, and extracting attributes of classes from the Knowledge Graph. Among her many awards are the IBM Faculty Award (2003 and 2008), Fellow of the Indian National Academy of Engineering (INAE) (2013), and the PAKDD Most Influential Paper Award 2014 for the paper “Discriminative Methods for Multi-labeled Classification” by Shantanu Godbole and Sunita Sarawagi (PAKDD 2004). Prof. Sarawagi has several patents to her name. They include a patent for “Database System and Method Employing Data Cube Operator for Group-By Operations” and a patent for “Efficient evaluation of queries with mining predicates”. The Infosys Prize 2019 in Engineering and Computer Science was awarded to Prof. Sunita Sarawagi for her research in databases, data mining, machine learning, and natural language processing, and for important applications of these research techniques. The prize recognizes her pioneering work in developing information extraction techniques for unstructured data.
Foreword: For a researcher in machine learning who is constantly bombarded and inundated with information, finding your way amidst the cacophony can be nerve-wracking. Read on as Prof. Sunita, a seasoned veteran in this field, shares her pearls of wisdom. Some of the major questions that she addresses here are:
- How to prudently navigate through the ever-exploding sea of ML literature?
- What is the best time to go for a research internship?
- How many problems should you be working on as an ML researcher?
And many more…
Stanly Samuel, Ph.D. student, CSA, IISc
This interview is between Shubham Gupta and Prof. Sunita Sarawagi.
Hundreds of machine learning papers are uploaded to venues like arXiv every day. It is very easy for a beginner to be overwhelmed and feel lost. Can you please share some tips for reading papers and finding good problems?
First, one should understand that it is very difficult to come up with a good research problem. And now, with the overcrowding in the field of machine learning, it’s becoming even harder.
Some tips which I can provide are: first, it might be useful to attend workshops [rather than] conferences within areas which are aligned with one’s area of Ph.D. In workshops, you get exposed to futuristic topics. Second, I find it useful to talk to the industry. So, if one goes for an internship, then one should not just be talking to the small group in which one is interning but generally find out what everyone else is doing, particularly in a large organization.
I work in applied machine learning (ML) and I often get ideas by talking to people. Another source is brainstorming with your own graduate students or colleagues. Ideally, it would be perfect if you could brainstorm with more experienced researchers, but just by brainstorming with your advisor and by being creative, you can come up with new ideas.
If you are trying to look for what to work on from a paper then you are already too late by about a year. I rarely try to look at a paper’s future work section [to identify problems] while reading published papers. Sometimes it might work, like in a closed community such as the old theory community where people know what the frontier is and you can follow up on someone’s future work. But, in applied ML, I cannot think of this happening very often. It is not enough to just look at [someone else’s] future work section. When you read papers, you have to think out of the box.
About talking to the industry, especially for applied ML students, how much exposure do you think a Ph.D. student needs, and when is the right time to get that exposure?
It depends on the advisor. If the advisor has a lot of exposure, then maybe the student might get away with not building his or her own exposure early in the Ph.D. process. But I think that even if a student already has a good topic to work on, going to the industry helps because it gives the student a different perspective and it also increases confidence. Usually, people in the industry treat their colleagues much better and hence you feel important. In contrast, in the academic world, typically it’s a war. In academia, faculty advisors [are often] themselves overworked and they have no incentives to be extra nice to people. There are, of course, exceptions. So, I feel that going to industry shakes you away from the monotony of your Ph.D.
There is no one good answer [regarding the right time to go for an internship]. I went for an internship towards the end of my Ph.D. and that helped me in getting a job in the company of my choice. It depends on your goal: if you want a job, going for an internship late, when you are more impressive, helps because you will appear as a strong applicant to a future employer. When I was doing my Ph.D., my area of research was not considered hot and it was very difficult to get into [the industry]. But, because I went there as a fourth-year Ph.D. student, I was able to impress people more than I would have had I gone as a first-year Ph.D. student.
That is one factor. If you are looking for a research problem, then, of course, you want to go earlier. Maybe in your second year. In your first year, unless you already have a lot of exposure to research, your time is best spent with your Ph.D. advisor because then you can do your own thinking before you get any external influence. I would generally think that going in the middle of a Ph.D. may not be a good idea unless the problem that you work on is directly related [to your internship problem].
I have a question related to what you had mentioned earlier. You were talking about getting started and you made a distinction between theoretical and applied ML. For finding applied ML problems, you talk to people and you brainstorm with them. Theoretical machine learning folks can in addition benefit from following someone else’s future work. But is it true that it is hard to start with theory in the absence of excellent guidance?
Yes, I think that is true. You might also benefit from talking to people who are doing computer science theory [not specifically ML theory]. They think very differently. I did some of my theory papers on my own, but I also got a lot of help from my more theory-oriented co-authors in some of the other papers. But, I believe, a Ph.D. student’s time is a great time to also try to attempt theory, even though you may not be trained, because one can try to do incremental theory.
One template that might work is: suppose if you are trying to solve a problem and someone else has solved another problem where proof techniques and theory follow from a similar template, then even if your problem is different, you can still borrow their proof techniques. I remember the time when I was working on a class prior shift problem. I was looking at published papers that had used similar methods for a related problem, and I was able to write the proof by extending [the existing results]. Other times, you venture out and do things where you are not already strong. [Having] a co-author will help a lot. However, you first have to get their interest.
This question is from one of our readers. The field of ML is too crowded to work on a single problem at a time. Is it good to work on multiple problems simultaneously as a safe bet?
I think there is no one good answer here. If you know that the problem that you are working on has a greater chance of being scooped because there are other people who are also working on the same problem, then you are better off putting your entire energy into that problem and publishing before someone else can beat you to it. Otherwise, you might want to diversify.
At least when I was a graduate student, I found it much easier to work on one problem at a time. Because you would get obsessed about that one problem. In general, I think if one is very unsure about [selecting a] problem or if one wants to maintain a sense of progress, then one can work on multiple problems.
It also depends on the institute. Some institutes are happy with collaborators. So you can have two students working together while one student focuses more on one idea. Particularly when you are in an environment like IISc, the Ph.D. students can work with master’s and bachelor’s students. That way, you can get the other student to do some of the day-to-day work while you focus on the ideas.
I think that it is difficult for one student to multiplex across problems if he or she is the major contributor in all of them. It adds a necessary switch-over overhead. I have been at my best while working on one problem at a time. As an assistant professor, I would have maybe two or three problems. But then, of course, I had students who would be making progress on them. Even then, I used to do a lot of coding myself. When I say that I was working on one problem, I mean I would be doing the whole thing: coding, running experiments, writing the paper, and so on. And then it would get really difficult to work on multiple problems while also teaching and advising.
As a professor, how do you find time to write code?
I used to love writing code. I would not believe or trust anyone else’s code. So, I would actually write code myself. For example, most of the coding was done by me for our KDD 2014 paper [in addition to] collecting the data, labeling it, and writing the paper. Nowadays it has become a bit difficult. Now I have a lot more students [and it is hard to find time to code].
There are long periods of time where you are just looking for a problem. How do you stay motivated during this time?
That is actually a good one. First, it is very important to stay connected, even when you are looking for problems. Even though your advisor might be avoiding you, you have to keep pestering your advisor. Continue to go for your weekly or biweekly meetings. Tell him/her what you have done, what your thoughts are. Ask what they think about the problem. Do not vanish. Because that absolutely does not help. Do not think, “I am going to solve this problem, or come up with a problem, and only then meet my guide.” Maintain regular contact.
Second, even when you are looking for problems, to stay motivated you should attend talks. All these talks are online nowadays. Once in a while, maybe every two-three days, listen to a talk. Even if it is peripherally related. It will excite your brain and will throw some ideas which you would have otherwise not thought about.
Third, I find it helpful while looking for problems to have reading groups. Then you have other people discussing. But frankly, to have an effective reading group, you must ensure that everyone participates equally. Keep a smaller reading group of somewhat energetic people who have good ideas and discuss with them. You have to mix reading with thinking. If you just do the thinking, you might go down a rabbit hole. It is like exploration vs exploitation.
Make sure to record your thoughts at the end of the day. Otherwise, it might be very demoralizing. You may have thoughts like: “I have not done anything”, or “I spent 15 days or one month and what is my output?”. Keep track of your thoughts. Maintain a logbook and keep track of what you discuss. When I am thinking about ideas, I take a notebook and break it up. Maybe you will be exploring three-four paths. I break up that notebook into three-four parts and I write down my thoughts. Do not keep it all in your head. It will also help when you meet your advisor or other students.
You should also maintain a deadline. As in, “if by this point I do not come up with an idea, I would take the next step”. Figure out what that next step would be.
One more thing I would say about staying motivated and not getting depressed: I am very scared of students getting depressed and losing it. Just go out, exercise. Both physical and mental alertness are required to do good research. Many Ph.D. students make the mistake of isolating themselves. They spoil their health. Sometimes, just going for a run or going for a swim helps. It gives you new ideas. Stay healthy.
Other interesting questions answered in the longer version:
- Here is another question from our readers. Most of the ML research has been reduced to beating the accuracy numbers. Is it a good way to go forward, and if not, then how should we proceed?
- I want to ask a related question. It’s about people who publish a lot of papers. I believe that paper count is not a good metric. Do you think that using paper count as a metric leads to the kind of research where you can tweak things slightly to beat the state-of-art results, without worrying about the quality of the idea?
- How would your advice to a junior Ph.D. student differ from that to a senior Ph.D. student?
- The next question is about your industry experience. How do you think industry experience is different from academia? And, what advice would you give to an Indian student?
- Would you like to give advice specifically for the Indian students? I have been told that if you just send a cold e-mail to the professors [regarding an internship] or if you apply online to places like Google, it is extremely unlikely that you will be picked up. They get thousands and thousands of applications. How do we deal with this?
- There is noise in the review process. Is there something that can be done to fix the noise? How serious is this problem?
- What do you mean by a two-tier reviewing system?
Credits:
- Compiled and edited by: Shubham Gupta and Stanly Samuel
- Interviewed and transcribed by: Shubham Gupta
CSA Writing Team (CWT)