Question
This assignment is a review of commonly used data mining tools. The list of packages associated with data mining is extensive. The most popular statistical software packages (SAS, Statistica Data Miner, and SPSS) provide the broadest range of features, with remarkably similar modeling and interface approaches, whereas the other packages each have their own special sets of features and specific target audiences whom we believe they will serve well. All of this is on the commercial off-the-shelf (COTS) software side. Sometimes a high-level data mining platform is not enough for a particular mining task, and data scientists need to drop down to a lower-level statistics or programming language, whether compiled or scripting (Java, Python, R, etc.).
Your assignment is to:
Conduct a survey about software tools/packages that might be used for data mining, and write a paper to compare and contrast THREE of them in the areas of data mining.
In your paper, try to answer the question: “Why is this tool the best one I can use to solve my data mining problem?” Cite your sources properly. Feel free to go beyond.
Your paper should be no less than 3 full pages using the attached template (10 pt Times New Roman font, single line spacing, double-column, etc.), including proper references.
Submissions that don’t meet the above requirements will not be graded.
A review paper about data mining software packages is also included for your reference.
Explanation / Answer
In this age of information technology, data is very valuable because so much can be done with it, and the amount of it is growing exponentially day by day. Most of the time the data we deal with is unstructured rather than structured, so we use data mining technology to extract valuable information from it. Mining valuable data and representing it in an understandable form is the main goal of data mining.
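As a rough illustration of extracting information from unstructured data, the sketch below counts the most frequent terms in a piece of free text using only the Python standard library. The function name top_terms, the tiny stop-word list, and the sample review text are all made up for this example; real tools apply far more sophisticated processing.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "is", "and", "of", "to", "it", "but"}

def top_terms(text, n=3):
    """Return the n most frequent non-stop-word terms in free text."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    # Count every token that is not a stop word.
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Made-up unstructured input: a few sentences of product feedback.
reviews = (
    "The battery is great. Battery life is long, but the screen is dim. "
    "Great battery and a great price."
)

print(top_terms(reviews, 2))
```

The structured result (term and count pairs) is the kind of understandable form that the mining step is meant to produce from raw text.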
A few powerful data mining tools are:
NLTK
Orange
Scrapy
Weka
R
Oracle Data Mining
KNIME
RapidMiner
Python
As the problem statement asks, I am going to discuss three of the most popular data mining tools: Orange, Weka, and RapidMiner.
RapidMiner: This is often ranked among the top data mining tools. It is written in Java and offers a template-based framework for advanced analytics. Its main advantage is that users do not need to worry about coding, because it offers many templates as a service and users will hardly ever have to write code.
It provides statistical modeling, data visualization, and much more functionality, and it can also use learning schemes from other available tools such as Weka.
Weka: Both Java-based and non-Java versions of Weka exist, and the Java-based version is very popular because it is simple to use. The Java version is used in many applications, such as predictive modeling, data visualization, and data modeling, and it supports many other features as well. One of Weka's best features is that you can customize the tool at any time, so if your requirement is to customize the tool, prefer Weka over RapidMiner and the other available data mining software. Weka is free under the GNU General Public License (GPL). It supports regression, classification, clustering, modeling, and many more features.
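To make the classification task mentioned above concrete, here is a minimal k-nearest-neighbor sketch in plain Python. It is not Weka's API or implementation; the function knn_predict and the toy height/weight data are assumptions made up for illustration, showing only the general idea of predicting a label from labeled examples.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify query by majority vote among its k nearest training points."""
    # train is a list of (features, label) pairs; distance is Euclidean.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up toy data: (height_cm, weight_kg) -> label
train = [
    ((150, 50), "small"),
    ((155, 55), "small"),
    ((160, 58), "small"),
    ((180, 85), "large"),
    ((185, 90), "large"),
    ((190, 95), "large"),
]

print(knn_predict(train, (158, 56)))
print(knn_predict(train, (183, 88)))
```

Tools such as Weka, RapidMiner, and Orange package this kind of learner (and many others) behind a graphical interface, which is why users rarely need to write such code by hand.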
Orange: Orange is written in Python, whereas RapidMiner and Weka are written in Java. It is also a powerful open source tool. It supports machine learning and user add-ons. So, if you have little need for custom implementation in your software product, Orange is the best choice.
------------------------
I have explained several of the most popular data mining tools. If you have any doubts or need further explanation, please let me know in the comments. Thanks.