- Nov. 29: Our record for provisioning the largest high-performance cluster in the cloud (1,100,000 vCPUs!) was mentioned during the AWS re:Invent 2017 Keynote. Link to YouTube video.
- Oct. 15: A new version of our working paper "Scalable dynamic topic modeling with clustered latent Dirichlet allocation (CLDA)" is available on arXiv.
- Oct. 13-14: I'm at the amazing New Directions in Analyzing Text as Data (Text As Data 2017) conference at Princeton University.
- Oct. 9: Two of our papers were accepted to the 2017 IEEE International Conference on Big Data: "Representativeness of latent Dirichlet allocation topics estimated from data samples with application to Common Crawl" and "Detecting and summarizing emergent events in microblogs and social media streams by dynamic centralities".
- Oct. 8: Our paper "Automated cluster provisioning and workflow management for parallel scientific applications in the cloud" has been accepted to the 10th Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers (MTAGS), held in conjunction with the 2017 International Conference for High Performance Computing, Networking, Storage and Analysis.
- Sept. 28: Our work on provisioning high-performance computing clusters on Amazon Web Services (AWS) is featured on the AWS Blog.
- Sept. 12: Our paper "Database of Parliamentary Speeches in Ireland, 1919-2013", which introduces one of the largest repositories of legislative speeches for quantitative text analysis, has been accepted to the 1st IEEE International Conference on the Frontiers and Advances in Data Science. More information about the data is available under Data on this website.
- Sept.: I will be serving on the program committee for the 2nd Southern Data Science Conference (SDSC) in Atlanta.
- Aug.: I will be serving on the program committee for the 2017 Open Science in Big Data (OSBD) workshop, held in conjunction with the 2017 IEEE International Conference on Big Data.