Natural Language Processing
DSCI 6004-2: Natural Language Processing Term Project

Proposal Slides (5%): Due Thursday 03/21/2024 11:59pm ET
Code Demo and Presentation (5%): Due Thursday 04/17/2024 11:59pm ET
Final Submission (15%): Due Thursday 04/24/2024 11:59pm ET

The purpose of this term project is to demonstrate your practical skills in applying deep learning methods to solve interesting problems in NLP. You may choose one of the suggested projects or propose your own. Students may do final projects solo or in teams of up to 3 people; we strongly recommend working in a team. Larger teams are expected to do correspondingly larger projects, and you should only form a 3-person team if you are planning an ambitious project in which every member will make a significant contribution.

The first milestone is to prepare and submit a project proposal (deadline: Thursday 03/21/2024 11:59pm ET), consisting of a maximum of 9 slides, outlining the following:

• Team members (on the cover page)
• Project topic
• Statement of project objectives
• Statement of value: why is this project worth doing?
• Review of the state of the art and relevant works (max. 2 slides; include citations)
• Approach (i.e., what algorithms, datasets, models, tools, and techniques you intend to use to achieve the project objectives)
• Deliverables (i.e., a list of items that will be submitted upon completion, and their relevance to the stated objectives)
• Evaluation methodology (i.e., how we can check whether you have achieved the objectives; must include a list of relevant metrics)

The second milestone is to record a short (under 10 minutes) demonstration of your project (i.e., running code and presenting your slides) to your peers. The video must be uploaded to YouTube; if you wish to keep it private, you can post it as an "unlisted" video (deadline: Thursday 04/17/2024 11:59pm ET).
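The evaluation-methodology slide should name concrete, computable metrics. As a hedged illustration (toy labels, all data invented), precision, recall, and F1 for one class of a classification task can be computed like this:

```python
def precision_recall_f1(gold, pred, positive="pos"):
    """Precision, recall, and F1 for one target class of a classification task."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy labels, for illustration only.
gold = ["pos", "pos", "neg", "pos", "neg"]
pred = ["pos", "neg", "neg", "pos", "pos"]
p, r, f = precision_recall_f1(gold, pred)  # each is 2/3 on this toy data
```

Listing metrics like these (plus task-specific ones such as exact match, BLEU, or accuracy where relevant) makes the evaluation section checkable rather than aspirational.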
The third milestone is to submit a project report, written in the style of an ACL/NeurIPS/AAAI-style conference submission. It should begin with an abstract and introduction, clearly describe the proposed idea, present technical details, give results, compare to baselines, provide analysis and discussion of the results, and cite sources throughout (you'll probably want to cite at least 5-10 papers, depending on how broad your topic is). If you are working in a team of two or three, the paper should be on the order of 8 pages excluding references; working alone, you should target more like 5-6 pages. Don't treat these as hard page requirements or limits; let the project drive things. If you have lots of analysis and discussion or are trying something more ambitious, your paper might be longer; if you're implementing something complex but succinctly described, it might be shorter.

Note that your project is not graded solely on the basis of results. Feel free to try an idea that's a bit "out there" or challenging, as long as it's well motivated and can be evaluated. Critically, you should approach the work so that success isn't all-or-nothing: you should be able to show results, describe some successes, and analyze why things worked or didn't work beyond "my code errored out." Consider structuring your proposal in a few phases (like the course projects), so that even if everything you set out to do isn't successful, you've at least gotten something working, run some experiments, and have results to report.

Upload your completed project (including code and slides) to a new GitHub repository, along with a user documentation manual as a .md file describing the project and giving usage instructions for other interested students and researchers (deadline: Thursday 04/24/2024 11:59pm ET).
Grading Rubric

We will grade the project reports according to the following rubric:

Clarity/Writing (3 points): Your paper should clearly convey a core idea/hypothesis, describe how you tested it and what you built, and situate it with respect to related work. See the "Tips for Academic Writing" if you have doubts about what is expected.

Implementation/Soundness (5 points): Is the idea technically sound? Do you describe what seems like a convincing implementation? Is the experimental design correct?

Results/Analysis (7 points): Whether the results are positive or negative, try to motivate them by providing examples and analysis. If things worked, what error classes are reduced? If things didn't work, why might that be? What aspects of the data or model might not be right? If your paper revolves around building a system, you should try to report results for a baseline from the literature, your own baseline, your best model, and possibly results of ablation experiments.

Choosing a Topic

There are a few directions you can go with this project. You might do a more engineering-style project: pick a task and a dataset, design or expand on some model, and try to get good results, similar to what you were doing in the first three projects. You could also do a more analytical project: pick some problem and try to characterize it in greater depth. What does the data tell us? What does this tell us about language, or about how we should design our NLP systems? Your project should include something novel: your end goal shouldn't be just reimplementing what others have done. However, implementing someone else's model, or downloading and running an existing model, are great first steps and might end up getting you most of the way there; implementing a couple of approaches in order to gain insight from comparing them can also be a good project.
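The "your own baseline" item in the rubric can be very simple; a majority-class predictor is often the right starting point before comparing approaches. A minimal sketch, on hypothetical sentiment labels (all data invented for illustration):

```python
from collections import Counter

def majority_class_baseline(train_labels):
    """Return a classifier that always predicts the most frequent training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda example: majority

def accuracy(gold, pred):
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

# Hypothetical sentiment data; the baseline ignores the input text entirely.
train_labels = ["pos", "pos", "neg", "pos"]
test_texts = ["great movie", "terrible plot", "loved it"]
test_gold = ["pos", "neg", "pos"]

classify = majority_class_baseline(train_labels)
test_pred = [classify(t) for t in test_texts]
baseline_acc = accuracy(test_gold, test_pred)  # 2/3 here
```

Reporting this number alongside a literature baseline and your best model makes the "Results/Analysis" comparison concrete: any real model should clearly beat the majority class.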
One good way to attack things is to pick a task and a dataset, download and run a model from the literature, and assess the errors to see what it does wrong. While it's best to go in with some intuition of how you can improve things, letting yourself be guided by the data, and not sticking to assumptions that may prove incorrect, is the best way to build something that actually works well. Be bold in your choice! This project is not graded on how well your system works, as long as you can convincingly show that your model is doing something. Start with baby steps rather than implementing the full model from scratch: build baselines and improve them in a direction that will eventually take you towards your full model. The initial projects in this class are structured this way, to give you an example of the process.

The following is a (non-exhaustive!) list of tasks and corpora, just a few to give you some pointers. Another approach is to look through the papers in recent ACL/EMNLP conferences, see if there are topics that interest you, and then try to find datasets for those tasks.

Text annotation tasks: Tasks like POS tagging, NER, sentiment analysis, and parsing are well understood and have been thoroughly studied; it is hard to improve on state-of-the-art models for these on English datasets. However, other domains (web forums, biomedical text, Twitter) and other languages are less well understood, yet datasets exist for these and there are small "cottage industries" of papers around each of these topics. Many of the state-of-the-art English systems for these tasks have been discussed in class; perhaps download these and see how they compare to other models on new data.

Entity Linking: Entity linking involves resolving a span of text in a document ("John Smith") to the Wikipedia article capturing that entity's true identity (https://en.wikipedia.org/wiki/John_Smith_(explorer)).
Classical methods use data from Wikipedia and features such as cosine similarity of tf-idf vectors between the source context and the target Wikipedia article (Ratinov et al., 2011). A newly released dataset (Eshel et al., 2017) is much cleaner, larger, and more amenable to training neural network models. Multilingual approaches (Sil et al., 2018) might also be nice to investigate or follow up on.

Multilingual Settings: Pre-trained models like ELMo and BERT have been extensively studied for English. Because of the word-piece abstraction, there is clear transfer to related languages that use Latin script. Because of code-mixed data, these models also have some success for frequent languages that don't share an alphabet with English, such as Chinese. However, for more distant languages like Thai, which have their own script, these models underperform. Past work (Pires et al., 2019) offers some analysis of this on a few basic tasks, but there's a lot more to investigate here.

Interpretability/Probing Neural Networks: Given the success of neural models, particularly BERT, there is increased interest in understanding them: what their representations capture, how they generalize, etc. For example, we can use probing tasks to analyze the ability of LSTMs to generalize along certain dimensions (Linzen et al., 2016), or to understand what the layers of BERT capture (Tenney et al., 2019). One viable project option is to try to improve our understanding of these models through new analyses, or by probing them in new ways. Note that with such projects, you should really be aiming to test a clear hypothesis and be able to accept or reject it based on your results. It's not a good project to just say you'll plot some aspect of BERT, then plot it and draw hand-wavy conclusions.
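To make the classical tf-idf recipe for entity linking concrete, here is a hedged sketch (toy documents with invented text; not the Ratinov et al. system) that ranks candidate Wikipedia articles by cosine similarity of tf-idf vectors against the mention's context:

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Build tf-idf vectors (dicts: term -> weight) for a list of token lists."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    n = len(docs)
    return [{t: tf[t] * math.log(n / df[t]) for t in tf}
            for tf in (Counter(doc) for doc in docs)]

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy mention context and two hypothetical candidate articles.
context = "john smith sailed to the new world and explored virginia".split()
candidates = {
    "John_Smith_(explorer)": "english explorer who sailed to virginia".split(),
    "John_Smith_(economist)": "economist who wrote about markets".split(),
}
vecs = tf_idf_vectors([context] + list(candidates.values()))
scores = {name: cosine(vecs[0], vecs[i + 1]) for i, name in enumerate(candidates)}
best = max(scores, key=scores.get)  # "John_Smith_(explorer)" on this toy data
```

A real system would generate candidates from Wikipedia anchor statistics and use far richer features, but even this toy ranker illustrates why shared, low-document-frequency context terms drive the disambiguation.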
QA / Machine Reading: A plethora of question-answering datasets have been released recently: SQuAD (Rajpurkar et al., 2016), TriviaQA (Joshi et al., 2017), RACE (Lai et al., 2017), and WikiHop (Welbl et al., 2017). Many of these datasets are hard to improve on with fully end-to-end neural network approaches (you need large BERT models to push the state of the art). However, new datasets are always emerging (Dasigi et al., 2019; Lin et al., 2019). You can also tackle a subset of examples in a dataset, or try out an interesting technique other than end-to-end neural networks. Building faster QA models is also a potentially interesting direction.

CAUTION: Your project should not focus on new pre-training techniques for language models. Such experiments are too large-scale to execute feasibly even if you have access to significant compute resources. Fine-tuning BERT-Base on a dataset can often be done effectively with more limited resources, but will still typically require GPUs. Moreover, machine translation and summarization rely on training on particularly large datasets. There are good projects you can do in these domains, but you may wish to focus on low-resource settings or more traditional models, as large-scale neural approaches won't be feasible to explore unless you have access to significant GPU resources.

Computational Resources Available

The operating assumption is that Google Colab and your personal computers should be sufficient for your project. However, if you believe your project merits the allocation of additional resources, you may submit a petition (in writing, via email) by the proposal submission deadline.