I’m interested in everything except anything I’m not. My ultimate goal is to lower the barriers to knowledge with technology, specifically through data. I was lucky to receive my bachelor’s, master’s, and doctoral degrees from Seoul National University, which has a wonderful campus in South Korea. I currently work at Naver, building Papago’s machine translation engines.
- 2016-Present: Machine learning scientist, Naver Corporation
- 2009-2016: M.S., Ph.D., Data Mining Lab, Seoul National University (Advised by Sungzoon Cho)
- 2004-2009: B.S., Industrial Engineering / Economics (minor), Seoul National University
- [Nov 2017] I’m organizing the NLP-OSS Workshop with Masato Hagiwara, Dmitrijs Milajevs, and Liling Tan. The workshop is devoted to open source software for NLP and will be co-located with ACL 2018.
- [Jun 2017] I’m organizing and participating in the Machine Learning Camp 2017 at Jeju. I will be mentoring Judit Ács, and we will be building a morphological segmentation model for Hungarian!
- Distributed Representations focused on Finance Market (Under review)
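- N3WS: 키워드 및 요약문장 추출을 이용한 인터랙티브 신문기사 탐색 (N3WS: Interactive news article exploration using keyword and summary sentence extraction), Korea Information Processing Society Fall Conference, Nov 2017. [1]
- 밑바닥부터 시작하는 데이터 과학: 데이터 분석을 위한 파이썬 프로그래밍과 수학, 통계 기초 (Data science from scratch: Python programming and the basics of math and statistics for data analysis), Insight Publishing, Jun 2016. [2]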
- Automated discovery of construction tacit knowledge based on text mining: A Preliminary Study, CIB World Building Congress, Tampere, Finland, May 2016.
- D3를 이용한 시각적 스토리텔링 (Visual storytelling with D3), Insight Publishing, Jun 2015. [3]
- Pseudo term vector representation for fast document clustering (for Korean), Domestic conference of the Korean Institute of Industrial Engineers, Jeju, Korea, Apr 2015.
- 한국어 뉴스 기사에서 바이그램을 활용한 온라인 토픽 탐지 (Using bigrams for online topic detection in Korean news articles), Domestic conference of the Korean Institute of Information Scientists and Engineers, Jeju, Korea, Jun 2015.
- 북한 신년사에 대한 자동화된 텍스트 분석: 1946-2015 (Text analysis of North Korean New Year addresses: 1946-2015), Korean Political Science Review, 49.2, pp. 27-61, Feb 2015.
- 한국어 형태소 분석기의 현황 및 특성 비교 (Survey and comparison of Korean open source morphological analyzers), Korea BI Data Mining Society (KDMS) Fall Conference, Nov 29 2014.
- 웹기반 한국어 워드클라우드 생성기의 개발 및 활용 (The development and application of a Web-based Korean wordcloud generator), Korea BI Data Mining Society (KDMS) Fall Conference, Nov 29 2014.
- KoNLPy: 쉽고 간결한 한국어 정보처리 파이썬 패키지 (Korean natural language processing in Python), Proceedings of the 26th Annual Conference on Human & Cognitive Language Technology, Chuncheon, Korea, 2014.
- Bridging the semantic gap in multimedia retrieval with topic extraction from user reviews search, INFORMS Big Data Conference. San Jose, United States, 2014.
- Data based segmentation and summarization for sensor data in semiconductor manufacturing, Expert Systems with Applications 41.6, pp. 2619-2629, 2014.
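- 프로그램 리뷰 사이트와 Twitter를 통한 TV 프로그램 인기도 비교 (Comparing TV program popularity through program review sites and Twitter), Korean Institute of Industrial Engineers / Korean Operations Research and Management Science Society Joint Spring Conference, 2013.
- TV프로그램 정보 기반 자동녹화 방법론 개발 (Development of an automatic recording methodology based on TV program information), Korean Operations Research and Management Science Society Fall Conference, 2012.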
- Feature selection for identifying high defect density sensors in semiconductor manufacturing, INFORMS International. Beijing, China, 2012.
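- Parameter adaption for multivariate statistical process control in semiconductor manufacturing using genetic algorithms, Korean Institute of Industrial Engineers / Korean Operations Research and Management Science Society Joint Spring Conference, 2012.
- Robust segmentation for sensor data in semiconductor manufacturing, Korea BI Data Mining Society (KDMS) Spring Conference, 2012.
- Random Forest 기법을 사용한 저수율 반도체 웨이퍼 검출 및 혐의 설비 탐색 (Detecting low-yield semiconductor wafers and identifying suspect equipment using Random Forest), Korea BI Data Mining Society (KDMS) Fall Conference, 2010.
- mRFS: Minimum redundancy feature selection based on a clustering filter, Korea BI Data Mining Society (KDMS) Fall Conference, 2010.
- Feature selection for detecting faulty equipment parameters in semiconductor manufacturing process, Korean Institute of Industrial Engineers Fall Conference, 2010.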
- 2017-11-15: My model has higher BLEU, can I ship it? – The Joel Test for machine learning systems, Invited talk at the AIMLP Workshop, ACML 2017. [html] [pdf]
- 2015-11-14: Document classification with distributed representations, Techtalk at Naver Labs.
- 2015-10-13: The beginner’s guide to 웹 크롤링 (스크래핑) (The beginner’s guide to web crawling and scraping), Invited lecture for the “Methods of International Relations” course of the SNU Department of Political Science, at the request of Jong Hee Park. [slideshare]
- 2015-09-04: 헬로, 데이터그램 (Hello, Datagram), Invited talk at the Datagram Facebook study group. [html] [pdf]
- 2015-08-29: 한국어와 NLTK, Gensim의 만남 (Korean meets NLTK and Gensim), PyCon Korea. [html] [pdf] [slideshare]
- 2015-02-05: 린 정부와 대한민국 정치의 모든 것 (Lean government and everything about South Korean politics), Invited talk at the Linked Open Data Annual Conference. [html] [pdf]
- 2014-11-09: How we open the National Assembly in South Korea with technology (Politics in Korea), g0v Summit (Invited talk). [website] [code] [html] [pdf] [slideshare]
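- 2014-10-17: 린 정부와 대한민국 정치의 모든 것 (Lean government and everything about South Korean politics), Invited talk at the South Korea National Assembly Library. [html] [pdf]
- 2014-08-30: 자바 미안하다! (Sorry, Java!) Korean NLP with Python (KoNLPy), PyCon Korea. [html] [pdf] [slideshare (ko)]
- 2014-05-27: 비개발자의 개발 이야기 (A non-developer’s development story), Invited lecture for the “컴퓨터의 개론 및 특강” (Introduction to Computers and Special Lectures) course, at the request of Yerim Choi. [html]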
- 2014-05-07: Different roles in diffusing legislative data, Invited talk at OGP Asia Pacific Regional Conference with Cheol Kang. [html]
- 2014-02-11: Introduction to Politics in Korea & Team POPONG, Brown Bag Lunch at Sunlight Foundation. [html]
- 2013-11-30: 대한민국 정치의 모든 것: 쉽고 재밌는 일상 속 정치 만들기 (Everything about South Korean politics: Making politics easy and fun in everyday life), Korean Semantic Web Conference (KSWC). [html]
- 2013-10-30: Expressing your data, Invited lecture at SNU Graduate School of Convergence Science and Technology, Web Application Systems Dept., at the request of Myungdae Cho. [html]
- 2012-11-02: Introduction to data mining for newbies, Invited lecture at SNU Health Demography Lab, at the request of Youngtae Cho. [slideshare]
- 2009-11-23: On semi-supervised learning and beyond, Lab seminar at SNU Data Mining Lab [slideshare]
Honors and Awards
- Jun 2014: Lee Joong Han Award - Research Division Winner, Industrial Engineering Dept., SNU. [4]
- Dec 2013: Samsung Tomorrow Solutions Competition - Winner, Samsung Electronics, Ministry of Science in Korea. [5]
- Mar 2011: Ph.D. Scholarship, Samsung Electronics. [6]
- Mar 2011: Global Ph.D. Fellowship, National Research Foundation in Korea. [7]
[2] Korean translation of Data Science from Scratch: First Principles with Python by Joel Grus.
[4] The Lee Joong Han Award (research division) is given annually to graduate students in the Industrial Engineering department of Seoul National University whose first-authored publication from the previous year appeared in the journal with the highest impact factor. Granted $2,000.
[5] The Samsung Tomorrow Solutions Competition is a Korean national competition, held by Samsung Electronics and the Ministry of Science of Korea, that aims to develop and implement solutions that make a change in our future. It was a six-month competition with more than 3,000 participants and 1,000 teams. Awarded the Minister of Science Award (first place). Prize approx. $50,000.
[6] Granted approx. $100,000 until graduation.
[7] The Global Ph.D. Fellowship is a government grant program for young researchers. In 2011, 300 scholars were selected from 1,270 applicants. Granted approx. $90,000 for 3 years.