-
Notifications
You must be signed in to change notification settings - Fork 4.5k
Similar subtree matching notes
Gunther Cox edited this page Nov 21, 2016
·
4 revisions
Goal: Given a list of strings, find the most similar matching sequence in a graph data structure.
- The found sequence may have additional strings or strings that are not in the one being searched for.
- An optimal solution will have the greatest number of strings that have a close match in the original sequence, and the fewest number of extraneous strings added.
- The last statement in the best matching sequence that gets selected should have possible responses to choose from.
- Convert text to stemmed text
- Compare stemmed strings
A "smarter" comparison algorithm could equate similar sentences by their actual meaning. For example, sentences such as:
Have a free sample of this frozen dairy beverage. Try a sample of ice cream at no charge.
These two statements have very different text but the same meaning.
- Communication: A connection that represents the response relationship between two statemets and holds other meta data.
- speaker: F-key Persona
- date: datetime
- statement: F-key Statement
- in_response_to: F-key Statement - The parent node
- Conversation
- statements: A ordered list of Statement objects representing a conversation
- See Algorithms book, page 380
- Wildcard between each statement?
- Think of a sequence of strings in the graph as one long string?
- Tries data structure