COMP 10261: The FAQ Bot Plus Project
Sam Scott, Mohawk College, January 2022
OVERVIEW
An FAQ Bot answers questions about a particular topic. It is a
conversational interface to a stock set of questions and answers.
When an FAQ Bot receives an utterance, it determines the
user’s intent by matching that utterance to one of its stored
question and answer pairs. If it succeeds in determining intent in
this way, it uses the answer as its response. In the example on
the right (from Vajjala et al.’s Practical Natural Language
Processing) the FAQ Bot has determined that the first two
utterances have the same intent and has responded with the
same text in both cases.
If an FAQ Bot fails to determine intent, it usually outputs a
standard message to let the user know that it does not know the
answer. But your FAQ Bot Plus will use linguistic knowledge
from spaCy to get a bit chattier in this case.
This handout brings together all the project requirements for
the final project submission.
PHASE 1: FAQ BOT
In this phase, the goal is to update your Phase 0 FAQ Bot using fuzzy regular expressions to determine a
user’s intent.
1. From Phase 0 (Should already be complete). Determine your FAQ Bot’s knowledge domain and
prepare a set of 20 question and answer pairs. One easy way to do this is to find a long
Wikipedia page and copy sections of 1 to 3 sentences as each answer and generate a question
to go with each answer. Make sure you reference all online sources in comments.
2. Generalize by generating at least one more possible question for each answer. Ideally, the new
question should have a different wording, representing another way a user might ask for the
information in the answer.
3. Create a fuzzy regular expression for each answer that is capable of matching key parts of both
possible questions and is tolerant to a limited number of typos in each question.
4. Store questions, answers, and regular expressions in text files.
5. Create a Python program (or modify your Phase 0 FAQ Bot) to load the answers and regular
expressions from files, then allow the user to make utterances. Try to find the best match for
the user’s utterance from your list of regular expressions and output the corresponding answer
as a response. When there are multiple matches, you should have some strategy for
determining which match is better.
6. The bot should also respond to “hello” by greeting the user, and “goodbye” or “quit” by ending
the program. If it fails to match an utterance, the bot should politely let the user know that it
didn’t recognize their question.
Example FAQ Bot exchange (the figure referenced in the Overview, from Vajjala et al.’s Practical Natural Language Processing):
Utterance: Are there limits to the size of dataset I can use for training?
Response: Amazon Machine Learning can train models on datasets up to 100GB in size.
Utterance: What is the maximum size of training dataset?
Response: Amazon Machine Learning can train models on datasets up to 100GB in size.
Utterance: What algorithm does Amazon Machine Learning use to generate models?
Response: Amazon Machine Learning currently uses an industry standard logistic regression algorithm to generate models.
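Steps 3 to 5 can be sketched roughly as follows. This is only one possible approach, not the required design. It assumes the third-party regex package (installed with pip install regex), whose {e<=2} syntax allows up to two typos in a group. The patterns, the file names mentioned in the comments, and the tie-breaking strategy (fewest typos first, then longest match) are illustrative assumptions.

```python
import regex  # third-party "regex" package, not the standard library "re"


def load_lines(path):
    """Return the non-empty lines of a UTF-8 text file, stripped of whitespace."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def best_match(utterance, patterns):
    """Return the index of the best-matching fuzzy pattern, or None.

    "Best" here means fewest typos, with ties going to the longer match.
    """
    best_index, best_key = None, None
    for index, pattern in enumerate(patterns):
        match = regex.search(pattern, utterance,
                             flags=regex.IGNORECASE | regex.BESTMATCH)
        if match is None:
            continue
        # fuzzy_counts is (substitutions, insertions, deletions)
        errors = sum(match.fuzzy_counts)
        key = (errors, -len(match.group()))
        if best_key is None or key < best_key:
            best_index, best_key = index, key
    return best_index


# Hypothetical patterns; in the project these would come from files, e.g.
# load_lines("regexes.txt") and load_lines("answers.txt") (assumed names).
PATTERNS = [
    r"(?:maximum size|limits to the size){e<=2}",
    r"(?:what algorithm|which algorithm){e<=2}",
]
ANSWERS = [
    "Amazon Machine Learning can train models on datasets up to 100GB in size.",
    "Amazon Machine Learning currently uses an industry standard logistic "
    "regression algorithm to generate models.",
]
```

A response would then be ANSWERS[i] when best_match returns an index i, and the polite "didn't recognize" message when it returns None.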
Test your bot as much as possible. Use the original question, the alternate wordings, and any other
wordings you can think of. If possible, give the bot to a friend or family member to play with and see
how well it works for them. Tweak your regular expressions as necessary to get the best possible
performance.
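One way to make this testing repeatable is a small harness that runs every known wording through the bot and reports the ones it gets wrong. The sketch below assumes the bot exposes a single function that maps an utterance to a response. The names run_test_cases and report are hypothetical.

```python
def run_test_cases(respond, cases):
    """Return (question, expected, actual) for every case the bot got wrong.

    respond: function taking an utterance string and returning a response.
    cases: iterable of (question, expected_answer) pairs.
    """
    failures = []
    for question, expected in cases:
        actual = respond(question)
        if actual != expected:
            failures.append((question, expected, actual))
    return failures


def report(failures, total):
    """Build a one-line summary plus one line per failed question."""
    lines = [f"{total - len(failures)}/{total} test questions answered correctly"]
    for question, expected, actual in failures:
        lines.append(f"FAILED: {question!r} -> {actual!r} (expected {expected!r})")
    return "\n".join(lines)
```

Rerunning the harness after each regular expression tweak shows quickly whether a change that fixes one wording has broken another.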
PHASE 2: FAQ BOT PLUS
In this phase, the goal is to make the FAQ Bot a bit chattier or human-like using linguistic knowledge
from the spaCy module. It should still answer the user’s questions as before, but if it fails to figure out a
user’s intent, it should employ a range of strategies to try to craft an appropriate response. This part of the
project is open-ended and creative, but you must make use of the spaCy pattern matcher with parts of
speech and/or lemmas in at least one part of your bot.
NAMED ENTITY RECOGNITION AND NOUN CHUNKS
When the bot doesn’t know what the user is talking about, Named Entity Recognition or even Noun
Chunks could help implement a fallback strategy. Here are some examples:
Utterance: Does the college have a relationship with Twitter?
(spaCy reports that Twitter is an organization – label ORG)
Response: Sorry, I don’t know. I don’t work for Twitter.
Utterance: Does Chicago have any colleges?
(spaCy reports that Chicago is a geo-political entity – label GPE)
Response: Sorry, I don’t know. I’ve never been to Chicago.
Utterance: Where is the general store located?
(spaCy finds the noun chunk “the general store”)
Response: Sorry, I don’t know anything about the general store.
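The fallback strategy in these examples could be sketched as below. This is a minimal illustration, not a required design. It assumes spaCy v3 and takes a Doc that has already been processed by a pipeline, for example nlp = spacy.load("en_core_web_sm") followed by nlp(utterance). The template dictionary and the function name fallback_response are hypothetical, and the labels spaCy actually assigns depend on the model.

```python
# Hypothetical response templates keyed by spaCy entity label.
ENTITY_TEMPLATES = {
    "ORG": "Sorry, I don't know. I don't work for {}.",
    "GPE": "Sorry, I don't know. I've never been to {}.",
}
DEFAULT_RESPONSE = "Sorry, I don't know the answer to that."


def fallback_response(doc):
    """Build a fallback reply from a spaCy Doc when no intent matched.

    Tries named entities first, then noun chunks, then a stock message.
    """
    for ent in doc.ents:
        template = ENTITY_TEMPLATES.get(ent.label_)
        if template:
            return template.format(ent.text)

    # Noun chunks need a dependency parse, so only try them if one exists.
    if doc.has_annotation("DEP"):
        for chunk in doc.noun_chunks:
            # Skip bare pronouns such as "you" or "it".
            if chunk.root.pos_ != "PRON":
                return f"Sorry, I don't know anything about {chunk.text}."

    return DEFAULT_RESPONSE
```

Checking entities before noun chunks is a design choice: an entity label says what kind of thing was mentioned, so it supports a more specific reply than a noun chunk does.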
SPEECH ACT CLASSIFICATION
To make the bot seem chattier or more human-like when it fails to match a user intent, you could
attempt to classify the speech act of the utterance. You can think of a speech act as a very high-level
intent that indicates what kind of action the user is trying to accomplish with their utterance. For
example, they could be asking a question, making a command, promising something, agreeing or
disagreeing with the bot, greeting the bot, etc. You might be able to figure this out by developing some
linguistic patterns in spaCy.
If the bot cannot determine the user’s intent using fuzzy regular expressions, it would at least be useful
to figure out if they are asking a question, trying to give you a command, or simply making a statement.
You could respond to questions with “Sorry, I don’t know the answer to that.” Or even “Sorry, I don’t
know about ___” if you can identify some noun phrase that represents what the user is asking about.
Commands could be responded to differently. “Sorry, I don’t know how to do that.” Or if you can figure
out what they want the bot to do, you could say “Sorry, I don’t know how to ___”.
EXAMPLE QUESTIONS
To get you started, here’s a list of questions – see any patterns here?
Do you know anything about Jujitsu?
What is the capital of Albania?
How did you know that?
Where is my phone?
Why won’t you answer my questions?!?!?!
You’re what kind of bot, now?
Do I really have time for this…
(Note: The question marks are obviously a useful clue about whether something is a question or not, but
users will not always type them, and speech recognition systems might not include them when they
transcribe voice to text. Make sure you create patterns that will still work when there is no
punctuation.)
EXAMPLE COMMANDS
And here’s a list of commands…
Give me info about Jujitsu.
Tell me something interesting.
Don’t say "I don’t know" again.
Go get me some useful information.
Make me a cup of coffee.
Drive me to the airport, please.
OTHER IDEAS
What other things do you think a user might say to your bot? Can you use spaCy patterns to identify
more things you could respond to, or even plant some fun easter eggs for the user to find by saying
something that fits the right pattern? Feel free to implement any other ideas you may have on how to
make the bot chattier using linguistic knowledge. Have fun with it.
PHASE 3: DISCORD
Once the bot is working well in the Python shell, you should repackage it as a Discord bot and include a
link to add the bot to a server. If you want to host your Discord bot on CSUNIX or some other server, go
for it, but it’s not necessary as long as you hand in the code so that the instructor can run it themselves.
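One way to repackage the bot is to keep all response logic in a single function and wrap it in two thin interfaces. The sketch below assumes discord.py 2.x, where the message content intent must be enabled both in code and in the Discord developer portal. The respond function here is a hypothetical stand-in for the real bot logic, and the function names are illustrative.

```python
def respond(utterance):
    """Hypothetical stand-in for the bot's shared response logic."""
    text = utterance.strip().lower()
    if text == "hello":
        return "Hello! Ask me a question."
    if text in ("goodbye", "quit"):
        return "Goodbye!"
    return "Sorry, I didn't recognize your question."


def run_shell():
    """Interface 1: talk to the bot in the Python shell."""
    while True:
        utterance = input("> ")
        print(respond(utterance))
        if utterance.strip().lower() in ("goodbye", "quit"):
            break


def run_discord(token):
    """Interface 2: talk to the bot on Discord (token is the bot's secret)."""
    # Imported here so the shell version runs without discord.py installed.
    import discord

    intents = discord.Intents.default()
    intents.message_content = True  # needed to read message text in 2.x
    client = discord.Client(intents=intents)

    @client.event
    async def on_message(message):
        if message.author == client.user:
            return  # ignore the bot's own messages
        await message.channel.send(respond(message.content))

    client.run(token)
```

Because both interfaces call the same respond function, any improvement to the bot shows up in the shell and Discord versions at once.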
HANDING IN
You should place all the following into a single project folder, then zip it up and hand it in on Canvas.
1. A folder containing all the code and supporting files for your bot. It should be possible to run the
bot (both Discord and standalone) from this folder using Anaconda Python 3 with spaCy and the
English language models installed.
2. A text file called “phase 1.txt” containing the questions and answers that you used when
developing the FAQ Bot. There should be two questions for each answer, and it should be clear
which answer goes with which questions. I will use the questions in this file when I’m testing
your bot.
3. A text file called “phase 2.txt”. This file should contain any special instructions needed to get the
most out of the “chattier” aspects of your bot. How should we test your bot to see all the cool
stuff you included? Describe what kinds of utterances your bot can respond to and give us some
sample utterances that show your bot behaving at its chatty best.
4. A text file called “phase 3.txt”. This file should contain the link to the Discord version of your bot
along with any special instructions required to talk to it (prefixes, etc.), or any other special
features you want to show off that are unique to this version of the bot.
EVALUATION
Your project will be marked out of 20 using the following rubric.
Phase 1: FAQ Bot (4 points)
Level 4 (100%): Uses regex efficiently and effectively to answer all questions identified by the developer. Uses fuzzy regex efficiently to tolerate a small number of typos.
Level 3 (75%): Uses regex to answer most questions correctly. Offers useful responses to novel questions some of the time. Uses fuzzy regex to tolerate a small number of typos.
Level 2 (50%): Uses regex and/or fuzzy regex to answer some questions correctly.
Level 1 (25%): Correctly answers some questions.
Phase 2: FAQ Bot Plus (4 points)
Level 4 (100%): Uses linguistic pattern matching and other linguistic knowledge to respond appropriately when user intent is unknown. Exhibits a range of responses and echoes back phrases from the utterance in some cases.
Level 3 (75%): Uses linguistic pattern matching or other linguistic knowledge to respond appropriately when user intent is unknown. Exhibits a range of such responses.
Level 2 (50%): Uses linguistic pattern matching or other linguistic knowledge to respond appropriately sometimes when user intent is unknown. Exhibits some range of such responses.
Level 1 (25%): Responds appropriately sometimes when user intent is unknown. Exhibits a limited range of such responses.
Phase 3: Discord (2 points)
Level 4 (100%): Bot can be added to a Discord server and functions as well as the Python shell version.
Level 3 (75%): Bot can be added to a Discord server and functions almost as well as the Python shell version.
Level 2 (50%): Bot can be added to a Discord server and responds to utterances.
Level 1 (25%): Bot can be added to a Discord server.
Code Structure (6 points)
Level 4 (100%): Highly effective and efficient use of regex, fuzzy regex, and spaCy pattern matching. Uses highly modular and well-structured code. Discord and shell versions of the bot are identical other than the interface code.
Level 3 (75%): Effective use of regex, fuzzy regex, and spaCy pattern matching and/or mostly modular code, shared between the two bot versions.
Level 2 (50%): Uses regex, fuzzy regex, and spaCy pattern matching and/or somewhat modular code.
Level 1 (25%): Limited use of regex, fuzzy regex, and spaCy pattern matching and/or limited modular structure.
External Documentation (2 points)
Phase 1, 2, and 3 text files are present and complete. Instructions and test cases are complete enough to coax the best possible behavior from the bot.
Some of phase 1, 2, and 3 text files are present and/or the instructions and test cases are somewhat complete.
Internal Documentation (2 points)
Commenting and naming conventions are consistent with the course standards (based on PEP 8 and PEP 257). All files contain a docstring with a description, author information, and links to original sources. All functions contain a docstring with a description of behavior, parameters, and return values.
Commenting and naming conventions are somewhat consistent with course standards and/or docstrings are missing or incomplete for some files.