I have two non-overlapping sets of items, with feature counts for each. What sta

Question

I have two non-overlapping sets of items, with feature counts for each. What standard algorithms can I use to extract the most statistically distinct features of each set?

For example:

Items served at American restaurants (5 restaurants surveyed):
bread: 4
burgers: 2
cheese: 1
cronuts: 2
pasta: 2

Items served at Italian restaurants (10 restaurants surveyed):
bread: 7
pasta: 10
cheese: 8

I want to be able to know that cronuts and burgers are distinctly associated with American restaurant menus, and cheese and pasta are distinctly associated with Italian restaurant menus.

Explanation / Answer

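This looks like a standard supervised classification problem: each restaurant is an example, the items it serves are its features, and the cuisine is the label. Any classification technique could work here. You might start with Naive Bayes.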

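If you want to score a single feature on its own, you could use information gain (how much knowing whether an item is served reduces your uncertainty about the cuisine) or BIC (the Bayesian information criterion).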
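As a rough illustration of scoring features one at a time, here is a minimal sketch that computes information gain for each item in the example. It assumes each count is the number of surveyed restaurants in that set serving the item, and that an item missing from a list has a count of zero. The function and variable names are my own, not from any library.

```python
from math import log2

# Counts from the question. Assumption: each count is the number of
# surveyed restaurants in that set that serve the item.
american = {"bread": 4, "burgers": 2, "cheese": 1, "cronuts": 2, "pasta": 2}
italian = {"bread": 7, "pasta": 10, "cheese": 8}
n_american, n_italian = 5, 10


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def information_gain(item):
    """Drop in cuisine entropy from knowing whether `item` is served."""
    a_yes = american.get(item, 0)
    i_yes = italian.get(item, 0)
    a_no, i_no = n_american - a_yes, n_italian - i_yes
    n = n_american + n_italian
    h_class = entropy([n_american, n_italian])
    h_cond = 0.0
    # Weighted entropy of the cuisine within "serves it" and "does not".
    for a, i in ((a_yes, i_yes), (a_no, i_no)):
        if a + i > 0:
            h_cond += (a + i) / n * entropy([a, i])
    return h_class - h_cond


def leans(item):
    """Which set serves the item at the higher rate."""
    rate_a = american.get(item, 0) / n_american
    rate_i = italian.get(item, 0) / n_italian
    return "American" if rate_a > rate_i else "Italian"


items = sorted(set(american) | set(italian))
gains = {item: information_gain(item) for item in items}

for item in sorted(items, key=gains.get, reverse=True):
    print(f"{item:8s} gain={gains[item]:.3f} bits  leans {leans(item)}")
```

Information gain only says how informative a feature is, not which set it points to, so the sketch also reports which set serves the item at the higher rate. On these counts bread comes out as the least informative item, which matches the intuition in the question.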

For the combination of all features, you can use a machine learning classifier. As I mentioned, I would suggest trying Naive Bayes first. If you need something more powerful, there are many other classifiers: random forests, SVMs, k-nearest neighbors. A textbook on machine learning covers these in more detail.
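To make the Naive Bayes suggestion concrete, here is a minimal sketch of a Bernoulli Naive Bayes model built directly from the counts in the question, in plain Python. It assumes each count is the number of restaurants in that set serving the item, uses add-one (Laplace) smoothing so that unseen items do not produce zero probabilities, and the two test menus at the end are made up for illustration.

```python
from math import log

# Counts from the question. Assumption: each count is the number of
# surveyed restaurants in that set that serve the item.
american = {"bread": 4, "burgers": 2, "cheese": 1, "cronuts": 2, "pasta": 2}
italian = {"bread": 7, "pasta": 10, "cheese": 8}
n_american, n_italian = 5, 10
items = sorted(set(american) | set(italian))


def p_served(counts, n, item, alpha=1.0):
    """Laplace-smoothed estimate of P(item is served | cuisine)."""
    return (counts.get(item, 0) + alpha) / (n + 2 * alpha)


def log_odds_american(menu):
    """log P(American | menu) - log P(Italian | menu) under the model."""
    score = log(n_american / n_italian)  # class prior from survey sizes
    for item in items:
        pa = p_served(american, n_american, item)
        pi = p_served(italian, n_italian, item)
        if item in menu:
            score += log(pa / pi)
        else:
            score += log((1 - pa) / (1 - pi))
    return score


def classify(menu):
    return "American" if log_odds_american(menu) > 0 else "Italian"


# Per-item evidence when the item is present: positive values point
# toward American, negative values toward Italian.
weights = {
    item: log(p_served(american, n_american, item)
              / p_served(italian, n_italian, item))
    for item in items
}

for item in sorted(items, key=weights.get, reverse=True):
    print(f"{item:8s} {weights[item]:+.2f}")

# Hypothetical menus, not taken from the survey.
print(classify({"bread", "burgers", "cronuts"}))
print(classify({"bread", "pasta", "cheese"}))
```

The per-item weights double as a ranking of distinctive features: items with large positive weights are associated with the American set and items with large negative weights with the Italian set. With only 15 restaurants the estimates are noisy, so treat the ranking as indicative rather than conclusive.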
