Wednesday, September 6 • 15:00 - 15:45
Method 2 - Machine learning: Tractable Data Journalism

Want to understand the text data you have at hand? Probabilistic models are a prominent statistical tool for doing so. For instance, they have been used to automatically discover the topics that a collection of documents talks about. Unfortunately, using them is rather difficult. One reason is that inference is in general intractable, i.e., it takes far too much time. You will see how to overcome this using independence and clustering. The result is topic models of which you can ask (almost) any question captured by the model and get the answer in reasonable time. In other words, you get a practical tool for data journalism.

Probabilistic models, such as probabilistic topic models, have had a broad impact on data journalism, both in research and in practice. Unfortunately, inference in unrestricted probabilistic models is often intractable. Motivated by the importance of efficient inference for data journalism, we will discuss two basic concepts, independence and clustering, and show how they lead naturally to tractable probabilistic models. This will be illustrated on text data, leading to tractable topic models in which, in addition to topics, you can also efficiently compute exploratory statistics such as mutual information.
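
To make the idea concrete, here is a minimal sketch in Python. It is not the speaker's model. It assumes a toy setup in which documents fall into two clusters and, within each cluster, the presence of each word is independent of the others. The cluster weights, the words, and all probabilities are made up for illustration, and the function names are hypothetical. Under these assumptions, any marginal probability is a weighted sum of products, so the mutual information between two words can be computed directly instead of by enumerating every possible document.

```python
import math

# Hypothetical toy model: two document clusters ("topics"), each with
# independent word-presence probabilities. All numbers are made up.
WEIGHTS = [0.5, 0.5]
WORD_PROBS = [
    {"goal": 0.8, "match": 0.7, "vote": 0.1},
    {"goal": 0.1, "match": 0.2, "vote": 0.9},
]


def marginal(evidence):
    """P(evidence) for a partial assignment {word: 0 or 1}.

    Words not mentioned in the evidence are summed out for free: within
    a cluster each word is independent, and its two outcomes sum to one.
    """
    total = 0.0
    for weight, probs in zip(WEIGHTS, WORD_PROBS):
        term = weight
        for word, value in evidence.items():
            p = probs[word]
            term *= p if value == 1 else 1.0 - p
        total += term
    return total


def mutual_information(word_a, word_b):
    """Mutual information (in bits) between two word-presence indicators."""
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            joint = marginal({word_a: a, word_b: b})
            if joint > 0.0:
                pa = marginal({word_a: a})
                pb = marginal({word_b: b})
                mi += joint * math.log2(joint / (pa * pb))
    return mi


if __name__ == "__main__":
    print(mutual_information("goal", "vote"))
```

Each call to marginal costs time proportional to the number of clusters times the number of words in the query, which is what "tractable" means in this sketch. Richer tractable models nest the same two operations, sums for clustering and products for independence, more deeply.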


Wednesday September 6, 2017 15:00 - 15:45
Kaminzimmer OG (C26) Erich-Brost-Institut, Otto-Hahn-Str. 2, 44227 Dortmund
