Sumukh S

Software Engineer | NLP Researcher

About Me

Heyyo! I am Sumukh, a Dual-Degree student at IIIT Hyderabad, working towards bachelor's [B.Tech (with Honors)] and master's [MS (by research)] degrees in Computer Science, with a focus on Natural Language Processing and Machine Learning. I expect to graduate by June 2021.


I recently finished my internship with the Data Science team at LinkedIn. Before that, I worked at startups and research labs in roles such as Software Engineer and Research Intern (data science and AI/ML).


I am a member of the Machine Translation and Natural Language Processing lab at the Language Technology Research Centre (LTRC), KCIS, and am advised by Prof. Manish Shrivastava. I have served as a Teaching Assistant for undergraduate and graduate-level courses.


Research interests:

  • Natural Language Processing
  • Information Retrieval and Extraction
  • Multimodal Learning

I spend my free time reading books that interest me, exploring new places, scrolling through Reddit, or planning my next trip.


I am actively looking for full-time Software Development or Machine Learning/NLP roles (June 2021 start), so if you have a position available or just want to say hi, my inbox is always open!

Work Experience

LinkedIn

Data Science Intern

Flagship Data Science team - LinkedIn Engineering

  • To recommend events through the My Network tab from the moment they are created, I developed unsupervised ML models and analysed the results. Experimented with GSDMM and LDA + word embeddings.

  • Drove a 14% lift in invite acceptance rate. Also ran a preliminary analysis with LinkedIn’s Interest Graph for 100M MAU.
  • Set up a UMP flow to calculate CTR of event live video notifications on a daily basis and analysed the results.

Used: Python, scikit-learn, gensim, Hive/SQL

Indian School of Business (ISB), Hyderabad

ML and SWE Intern

  • To automate the loan approval process and enable banks to analyse small transactions, I built MVPs for the pilot tests using Django, MySQL & Docker, and developed predictive models using Statistical Machine Learning (ML) algorithms with careful feature engineering.
  • Supported online learning, so new data could easily be added to the model, with feature engineering applied to it at run time.
  • Results from the pilot test for loan approval automation showed that the model performed better than human loan officers, which led to a journal publication. (Publication Link)

Used: Python, Docker, Django, MySQL/SQL, scikit-learn, XGBoost

Onward Assist

Data Science Intern

  • Worked on identification of cancer in liver tissue Whole Slide Images (WSI) and segmentation of affected areas using Deep Learning algorithms for Computer Vision.
  • Used a U-Net FCN architecture to achieve Jaccard Index scores of 0.654 and 0.647 on the two tasks.

Used: Python, PyTorch, scikit-learn

Marketfront Software Solutions

SWE Intern

  • To help users fetch and update details of their products and track their inventory across different platforms, I led a team of 3 to build an Android application, a webapp and a RESTful API. Custom QR codes could also be generated.

Used: Java [Android app development], Django, MySQL, Python, ReactJS/Javascript, HTML5, CSS

NLP lab, IIIT-Hyderabad

Undergraduate Research Assistant

  • Publication: Detection and Annotation of Events in Kannada at LREC 2020 (Workshop), Marseille, France (Paper Link)
  • Under the guidance of Professor Manish Shrivastava, working on information extraction from unstructured data.
  • Previously: Multimodal representation for Visual Question Answering & Machine Translation for low resource languages.

International Institute of Information Technology (IIIT), Hyderabad

Head Teaching Assistant

Head Teaching Assistant for the graduate-level Advanced NLP course (covers Deep Learning + NLP), taken by PhD, Masters and Undergraduate (senior/junior) students.
  • Responsibilities included setting & grading assignments, grading of exam papers, hosting tutorials and mentoring course projects, mostly related to Question Answering & Machine Translation.

International Institute of Information Technology (IIIT), Hyderabad

Teaching Assistant

Course on databases and their applications, taken by 200+ undergrads.
  • Responsibilities included setting & grading assignments, grading of exam papers, hosting tutorials and mentoring course projects.

Education

International Institute of Information Technology (IIIT), Hyderabad

July 2017 - Present (Expected June 2021)

B.Tech (with Honors) + MS (by research) in Computer Science Engineering

VVS Sardar Patel Pre-University College, Bangalore

Senior Secondary, Science, Math and Electronics (State Board)

Carmel High School, Bangalore

Secondary, Science, Math and Computer Science (ICSE)

Others: Indian National Mathematics Olympiad (INMO) Finalist, 2011

Jnanadeepa School, Shivamogga

Primary and Junior High (CBSE)

Projects

Word Problem Solver

Developed a model that, given an arithmetic word problem, extracts the relevant quantities and creates the required expression tree by predicting the operators using Deep Reinforcement Learning [DQN].

View Project

Custom Language Compiler

Developed a fully functional compiler front-end for a custom programming language similar to C. Built a scanner, parser, abstract syntax tree and interpreter for generating intermediate representation (LLVM IR) code from an input source file.

View Project

Wikipedia Search Engine

Created a search engine that uses Blocked Sort-Based Indexing to build an inverted index of the entire Wikipedia dump (73.3 GB), runs queries against the index, and retrieves the top 10 results via relevance ranking of the documents, implemented using tf-idf scoring.

View Project

Visual Question Answering

Implemented a system that takes an image and a question about the image as the input, and predicts the answer to the question. Used Bilinear Attention Networks [BAN].

View Project

Crowd Flow Prediction and Client Discovery Using Wireless Networks

Developed a system to identify crowd patterns from WiFi requests sent by mobile devices and triangulate client locations with WiFi routers. This data was also used to create heat-maps and perform time series analysis using ARIMA and Prophet.

View Project

Mini-Dropbox: P2P File Sync

Implemented an application-level program for a P2P network that keeps two separate directories synced, similar to Dropbox. Used sockets to communicate, and maintained file indices and MD5 hashes on all peers.

View Project

Machine Translation

Implemented a Phrase-Based Machine Translation model and various Neural Machine Translation models, including one using attention with coverage modeling, for translation between Hindi and Urdu. [Indian Languages]

View Project

Linux Shell

Implemented a command line interpreter in C which supports background jobs, environment variables, signal catching, piping and redirection with extensive error-handling.

View Project

Mini-SQL Engine

Implemented a small SQL engine with support for basic queries, joins and aggregate functions.

View Project

Ultimate Tic-Tac-Toe bot

Built a bot for a 4x4x4 ultimate tic-tac-toe game that decides the next move on a computer-generated board. Placed among the top 8 bots in a class of 120 [AI bot tournament].

View Project

Mini-projects

Some of the mini-projects that I have worked on: a Pacman Killer-like 2D game and a Legend of Zelda-like 3D game (C++, OpenGL), tweaking the Xv6 scheduler (C), an HTTP proxy server with cache (Python), and a quiz webapp (Ruby on Rails).

View Project

Skills

Get in Touch