Statistical Natural Language Processing Dietrich Klakow

News

16.09.2019

Leaderboard for the SNLP Project Competitive Mode

Hi everyone,

Teams that submitted the project with a team nickname take part in the optional competitive mode.

The leaderboard is given below.
 

Leaderboard
Team                  Accuracy (%)
law of RULE 2         77.7
Meow-Rawr             74.0
law of RULE           73.9
natasha 2             73.2
midnight              72.7
GEM                   71.9
15minBreak            71.8
a_team_has_no_name    71.3
SadCringe             70.8
natasha               70.5
A3K                   70.0
A3K 2                 69.9
TGDG                  69.4
Team_71               68.1

 

Thanks to everyone who took part in the competitive mode. This mode is just for fun, and the results on the leaderboard are only meant to show you how your submission stands among the other participants in the project. Your final project grade will not be based on your rank on the leaderboard but mostly on the analysis provided in your report and on how well-written your report is. Everyone will get detailed feedback about their project submission once all projects have been graded.

If we failed to rank a team that wanted to participate in the competitive mode, please write us an email before September 17th, 00:00.

Best regards,
SNLP Team
 

16.09.2019

SNLP Exam Inspection Date

Hi everyone, 


The inspection of the SNLP exam papers takes place on 30th September at 10:00 in the seminar room (Seminarraum) of building C7 3.
 

Best regards,

SNLP Team

30.08.2019

A Few Clarifications about SNLP Project

Hi everyone, 

We have received a few inquiries regarding the SNLP project by email. Therefore, we decided to make a few clarifications about the project and share them with everyone.

  • For the project, external packages that are allowed should be directly installable through the pip install command. Those include well-known standard packages such as NLTK, Gensim, SpaCy, scikit-learn and others. Using non-standard modules from GitHub repositories, for example, would not be accepted.  
  • Although not explicitly stated in the project sheet, the submitted code for the project is expected to be well-structured and sufficiently documented. The tutor who evaluates your submission may deduct points if they struggle to read your code or to understand how it works because good software engineering practices were not followed.
  • Try to avoid inefficient solutions. However, if the algorithm you have used is computationally expensive but performs very well, you are advised to report the running time of your code so we are aware of that beforehand.
  • For the optional competitive mode, each team can participate with a maximum of two submissions. That is, you could make predictions on the test set with two different systems to see how they would stand on the leaderboard. Don't forget to provide a nickname for your team if you want to appear on the leaderboard of competitive mode.
  • Python Notebook solutions are accepted but you still have to submit a written PDF report with the requirements stated in the project sheet. 
  • For the report, a 4-6 page single-column report using the LaTeX COLING 2018 template (or something similar) is recommended. The COLING LaTeX template can be downloaded here.

 

All the best with your project work and have a nice summer. 

 

Best regards,

SNLP Team

23.07.2019

SNLP Project is now online!

Dear SNLP participants,

We have released the project for the SNLP course. You will find the project document and the dataset under materials.

The project should be submitted through the CMS like any other assignment. Thus, it can be submitted any time before the deadline, Sept. 13th, 2019.

In the project, we have an optional competitive mode. If you want your system to appear on the leaderboard at the end of the project, you will have to choose a nickname for your team so the submission remains anonymous to the other teams.

We wish you the best for the rest of your exams and enjoy the summer break.

Best regards,
SNLP Team

 

18.07.2019

A few things to know about SNLP Final Exam

Dear SNLP participants,

Here are a few things to keep in mind before the SNLP final exam.

Time and location

  • The exam will take place this Friday, July 19th.
  • If you don't know where the exam hall, Physik Gr. HS, is located, make sure you find this hall in building C6 4 one day before the exam.
  • The exam will start at 8:00 sharp. Therefore, make sure you arrive at 7:45 to get properly seated.
  • The exam will end at 10:00. No time extension is possible in any case.
  • According to the statistics of previous years' exams, there is always one student who arrives 30 mins late because they couldn't find the exam hall. Please don't be that student.

Exam Guidelines

  •  You are NOT allowed to use any electronic devices, including calculators and mobile phones.
  •  If a question asks you to do calculations, you need to write out the numbers and simplify the solution whenever possible. You will not be evaluated on the final answer but on the correctness of your analysis.
  •  No sheets are allowed during the exam.
  •  You only need to bring pens, pencils, and a ruler in case you want your sketches to look neat (neat sketches are not required, given the time constraints).
  •  In addition to the question sheet, you will be provided with blank paper and a stapler.
  •  At the end of the exam, you have to sort your answer sheets according to the order of the questions and staple them to the question sheets before submitting them.

Exam Structure

  • There are 14 questions in total in the exam, divided into two sections: section I and section II.
  • In section I, there are 10 questions where each question is worth 6 points.
  • You will have to answer 8 questions from section I, thus you should spend no more than 10 mins (on average) on each question from this section.
  • In section II, there are 4 questions where each question is worth 12 points.
  • You will have to answer 2 questions from section II, thus you should spend no more than 20 mins (on average) on each question from this section.
  • Make sure your answers are clear, concise, and to the point. Writing irrelevant content will not earn you any points and will consume your time.
  • Eliminate anything that might consume your time, for example, using two pens of different colours.
  • Whenever you are asked to sketch a plot, don't forget to label the axes.
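
The time budget implied by the exam structure above can be checked with a short calculation. The following is only a sketch: the dictionary layout and the function name `exam_budget` are illustrative choices, and the maximum score is derived under the assumption that points come only from the questions you are required to answer.

```python
# Exam structure as stated in the announcement above.
SECTIONS = {
    "I": {"offered": 10, "to_answer": 8, "points_each": 6, "minutes_each": 10},
    "II": {"offered": 4, "to_answer": 2, "points_each": 12, "minutes_each": 20},
}


def exam_budget(sections):
    """Return (total_minutes, max_points) over the questions to be answered."""
    total_minutes = sum(s["to_answer"] * s["minutes_each"] for s in sections.values())
    max_points = sum(s["to_answer"] * s["points_each"] for s in sections.values())
    return total_minutes, max_points


minutes, points = exam_budget(SECTIONS)
print(minutes, points)  # 120 72
```

The 120 minutes match the exam slot from 8:00 to 10:00, so the recommended averages leave no slack for reading the question sheet or sorting the answer sheets.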

We wish you all the best and good luck with your exam.

Best regards,
SNLP Team

15.07.2019

SNLP Office Hours - Tuesday 17:30 C7 1, 0.12

Hi everyone,

We would like to inform you that the tutors of the SNLP course (Ayan, Badr, Tatiana, and Virab) will hold open office hours on Tuesday from 17:30 to 19:30 in C7 1, 0.12.

If you have any questions about the exam on Friday, consider coming to these office hours.

Best regards,
SNLP Team
 

28.06.2019

Assignment 10 + Chapter 9 slides are now online!

Dear SNLP participants,

 

Assignment 10 has been released. As always, you will find it with the datasets under Materials.

Moreover, the slides for today's lecture (graphical models in NLP) can be found under Materials. Please be aware that the CRF section in chapter 9 supersedes the CRF section in chapter 8 (especially the part on CRF software, since we have changed the recommended software package from CRF++ to CRFsuite). Everything else in chapter 8 remains relevant (e.g., sequence labeling tasks, Markov random fields, the definition of cliques, etc.).

Good luck.

Best regards,

SNLP Team

 

21.06.2019

Assignment 9 is now online!

Hi everyone,

Assignment 9 has been released. As always, you will find it with the datasets under Materials.

Good luck.

Best regards,

SNLP Team

14.06.2019

Assignment 8 is now online!

 

Hi everyone,

Assignment 8 has been released. As always, you will find it with the datasets under Materials.

Good luck.

Best regards,

SNLP Team

12.06.2019

A revision to assignment 7

Hi everyone,

Due to inconsistencies in the data table provided for exercise 1.3 in assignment 7, we have released a new revision of the exercise with a modified table. Everything else in the exercise sheet remains the same.

All the best.

Best regards,
SNLP Team

 

07.06.2019

Assignment 7 is now online!

Hi everyone,

 

Assignment 7 has been released. As always, you will find it under Materials.

Good luck.

Best regards,

SNLP Team

03.06.2019

CMS is back and Assignment 6 version 2 is now online

Hi everyone,

Due to an update of the CMS software, the system did not work properly over the weekend, so assignment 6 was released on the course web page.

We have now released a new revision of the exercise sheet that fixes a few minor typos to avoid potential confusion. Make sure that you get the most recent version of the exercise sheet. The requirements remain unchanged.

Best wishes,
SNLP Team


 

28.05.2019

A minor revision to Assignment 5

Hi everyone,

We have updated exercise sheet 5 with one minor revision to exercise 2.1 (LM smoothing with absolute discounting).

The requirements of the exercise are unchanged, and the revision is only to make the formulation of absolute discounting LM clearer and more elaborate.

Please make sure you work with the latest revision of Assignment 5.

Best regards,
SNLP Team

24.05.2019

Assignment 5 is now online!

Hi everyone,

Assignment 5 has been released. You will find it under Materials.

For exercise 2.1, it is very important that you review the course materials and build a solid understanding of the problem before getting your hands dirty with the implementation. The recommended readings are Chen & Goodman and Jurafsky & Martin, especially the sections on backing-off, interpolation, and absolute discounting smoothing.

Regards

SNLP Team

17.05.2019

Assignment 4 is now online!

Hi everyone,

Assignment 4 has been released. You will find it under Materials.


All the best.

Kind regards,

SNLP Team

13.05.2019

Minor revisions to Assignment 3

Hi everyone,

We have updated exercise sheet 3 with two minor revisions to exercises 1.1 (d) and 2.2 (c). The requirements for the two exercises have not changed, but we corrected a typo in exercise 1.1 (d) and added a clarification to 2.2 (c) to avoid potential confusion regarding the indexing of the summation in the LM scoring function. Moreover, some of the comments in the supplementary code have been updated for consistency.

We hope that these minor changes do not cause any major inconvenience.

Best regards,
SNLP Team

 

11.05.2019

Assignment 3 is now online!

Hi everyone,

Assignment 3 has been released. You will find it under Materials, together with supplementary code and an English text corpus.

The first exercise is a revision of information theory. All related concepts are sufficiently covered in the Manning and Schütze (M&S) textbook, section 2.2 (Essential Information Theory).

The 2nd and 3rd exercises are on the topic of statistical language modelling. Besides the M&S textbook, we provide a good reference on language models and smoothing techniques in the materials (Chen and Goodman). In Chen and Goodman, sections 1.1, 1.2, and 2.1 provide a clear overview of the topic with notation similar to what we have used in the exercise sheet.

We hope that you have had a great experience with the course so far. We encourage you to use the forum to ask questions or bring up interesting discussion points.

Have a nice weekend and enjoy the lovely weather.

Best regards,
SNLP Team

08.05.2019

Statistical NLP Coursework and Final Evaluation

Dear students,

In this post, we list important information about the Statistical NLP course evaluation, assignments, final project, and written exam.


Assignments:

  • There will be 10 weekly assignments, each worth 20 points (total 200 pts)

  • Each assignment aims to improve your theoretical understanding as well as your practical skills (Python programming for NLP)

  • A team of 2-3 students should work on the assignment, then only one of the team members submits the solution to the CMS for grading

  • The tutors will grade the assignment and each team member will get their grade in the CMS accordingly

  • Exercises will be evaluated against three criteria: completeness, correctness, and computational efficiency

  • All assignments are due on Friday at 23:59

  • You have to follow the submission instructions in the exercise sheet carefully, otherwise, the tutors might decide not to grade your solution

  • Students can present a part of their solution in a tutorial session and earn extra points

  • A single presentation will be graded on a scale of 1-5 points based on the quality of the solution as well as the clarity and the delivery of the presentation

  • Each student can give a maximum of 3 presentations, thus up to 15 points can be earned by presenting solutions

  • There will be some extra credit exercises in the assignments which can contribute to your points during the semester

 

Exam Qualification:

  • In order to qualify for the final exam, you need to obtain more than 122 points during the semester
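
The point rules above can be summarised in a small sketch. The function names `semester_points` and `qualifies` are hypothetical, and the code assumes that presentation points and extra-credit points count toward the qualification threshold in the same way as regular assignment points, which the announcement suggests but does not state explicitly.

```python
# Figures taken from the announcement above.
NUM_ASSIGNMENTS = 10
MAX_PRESENTATIONS = 3
MAX_POINTS_PER_PRESENTATION = 5
QUALIFICATION_THRESHOLD = 122


def semester_points(assignment_points, presentation_points=(), extra_credit=0):
    """Sum all points earned during the semester.

    Assumption: presentation and extra-credit points are simply added
    to the assignment points.
    """
    if len(assignment_points) > NUM_ASSIGNMENTS:
        raise ValueError("at most 10 assignments are graded")
    if len(presentation_points) > MAX_PRESENTATIONS:
        raise ValueError("at most 3 presentations per student")
    if any(p > MAX_POINTS_PER_PRESENTATION for p in presentation_points):
        raise ValueError("a presentation earns at most 5 points")
    return sum(assignment_points) + sum(presentation_points) + extra_credit


def qualifies(total_points):
    """'More than 122 points' is read as a strict inequality."""
    return total_points > QUALIFICATION_THRESHOLD


total = semester_points([12] * 10, presentation_points=[5, 4])
print(total, qualifies(total))  # 129 True
```

Under this reading, a student with exactly 122 points would not qualify.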

 

Course Assessment:

  • Final written exam: 80% of the final grade

  • Final NLP project:  20% of the final grade
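
As a rough illustration of the weighting above, the sketch below combines an exam result and a project result into one score. It assumes that both components are expressed on the same 0-100 scale; the announcement does not say how the combined score is mapped to a grade, so `final_score` is only a hypothetical helper.

```python
# Weights from the announcement, in percent.
EXAM_WEIGHT = 80
PROJECT_WEIGHT = 20


def final_score(exam_pct, project_pct):
    """Weighted average of exam and project results (both assumed to be 0-100)."""
    return (EXAM_WEIGHT * exam_pct + PROJECT_WEIGHT * project_pct) / 100


print(final_score(70, 90))  # 74.0
```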


 

Best wishes,
SNLP Team

 

08.05.2019

Tutorials start this week!

Hi everyone,

All students who have registered in the CMS have been assigned to tutorial groups, taking preferences into account as far as possible.

The time-slots and locations are in the table below. We look forward to meeting you in the tutorials.

 

Best regards,
SNLP Team

 

         Tutor            Time                  Location
Group 1  Ayan Majumdar    Thursday 14:15-15:45  E1 3, HS001
Group 2  Tatiana Anikina  Friday 10:15-11:45    E1 3, Seminarraum 014
Group 3  Virab Gevorgyan  Friday 14:15-15:45    C6 4, HS I
 

 

04.05.2019

Assignment 2 is released!

Hi everyone,

We have just released assignment 2. You will find it under Materials.

The first two exercises are a revision of probability theory. These exercises would be a few minutes' work over a warm cup of tea on a quiet Sunday afternoon for students who have a decent background in probability. If you feel that you may not belong to that group, we highly advise reading Manning and Schütze's textbook, Chapter 2, section 2.1 (Elementary Probability Theory) and sub-sections 2.2.1 (Entropy) and 2.2.5 (The noisy channel model).

Under Materials, we also provide a link to a standard graduate textbook on probability theory, in case you would like to build a deeper understanding of the fundamentals of probabilistic modelling.

We encourage you to use the Forum to ask questions related to the assignment sheets or to bring up interesting discussion points.

Regards,
SNLP Team

24.04.2019

Assignment 1 released!

Assignment 1 is released. You will find it, with the data, in Materials.
 

 

Statistical Natural Language Processing

 

Audience

This advanced lecture addresses Bachelor and Master students in Computational Linguistics, Computer Science, CuK, Mechatronics, or Visual Computing.

Contents

We plan to cover the following topics:

  1. Introduction
  2. Natural Language as a Sequence of Symbols
  3. Basics of Language Modeling
  4. Entropy
  5. Backing-Off Language Modeling
  6. Text Classification
  7. Word Sense Disambiguation
  8. CRFs and Sequence Labeling
  9. Information Retrieval
  10. Machine Translation

 

Textbooks

  • Chris Manning and Hinrich Schütze, Foundations of Statistical Natural Language Processing.
  • Jurafsky, Daniel, and James H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition.

 

Organisation

Lectures take place in HS003 in building E1 3.

Time: Friday, 08:30-10:00

Starts: April 26th

Final Exam:

  • Time: Friday, 19 July 2019, 08:00 to 10:00
  • Location: Building C6 4, Großer Hörsaal Physik (large physics lecture hall)

