http://blog.csdn.net/pipisorry/article/details/43089121
机器学习的来源和用例:
Machine Learning
- Grew out of work in AI
- New capability for computers
Examples:
- Database mining
Large datasets from growth of automation/web.
E.g., Web click data, medical records, biology, engineering
- Applications can’t program by hand.
E.g., Autonomous helicopter, handwriting recognition, most of
Natural Language Processing (NLP), Computer Vision.
机器学习的定义Machine Learning definition
Arthur Samuel (1959). Machine Learning:
Field of study that gives computers the ability to learn without being explicitly programmed.
Tom Mitchell (1998) Well-posed Learning Problem:
A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
一个计算机程序从与一些任务T还有一些性能指标P相关的经验中学习,如果用性能度量P测定在任务T上性能,则通过经验E来提高性能度量.
例子:Suppose your email program watches which emails you do or do not mark as spam, and based on that learns how to better filter spam. What is the task T in this setting?
Classifying emails as spam or not spam. Task
Watching you label emails as spam or not spam. Experence
The number (or fraction) of emails correctly classified as spam/not spam. Performance
None of the above—this is not a machine learning problem.
这个例子就是说program通过你label垃圾邮件来学习,完成垃圾邮件classifing的任务,并不断通过学习来提高performance.
机器学习算法Machine learning algorithms
- Supervised learning监督学习
- Unsupervised learning非监督学习
Others: Reinforcement learning, recommender systems.
监督学习Supervised Learning
Supervised Learning:
“right answers” given,给出训练数据{(size in feet2, price in 1000)的数据集}正确的值(这里是Price)
Regression:
Predict continuous valued output (price)
回归的例子:
Classification
Discrete valued output (0 or 1)
分类的例子1(1个feature):
分类的例子2(2个feature右边是更多的feature的例子):
区分分类和回归的例子:
You’re running a company, and you want to develop learning algorithms to address each of two problems.
Problem 1: You have a large inventory of identical items. You want to predict how many of these items will sell over the next 3 months.
Problem 2: You’d like software to examine individual customer accounts, and for each account decide if it has been hacked/compromised.
Question:Should you treat these as classification or as regression problems?
1 Treat both as classification problems.
2 Treat problem 1 as a classification problem, problem 2 as a regression problem.
3 Treat problem 1 as a regression problem, problem 2 as a classification problem.
4 Treat both as regression problems.
Answer:3 is right.
For problem one, I would treat this as a regression problem, because if I have, you know, thousands of items, well, I would probably just treat this as a real value,as a continuous value. And treat, therefore, the number of items I sell,as a continuous value.
And for the second problem, I would treat that as a classification problem, because I might say, set the value I want to predict with zero, to denote the account has not been hacked. And set the value one to denote an account that has been hacked into.
非监督学习Unsupervised Learning
not giving the algorithm the right answer for the examples in my data set.
聚类Clustering
例子:
So this is Unsupervised Learning because we‘re not telling the algorithm in advance that these are type 1 people, those are type 2 persons, those are type 3 persons and so on and instead what were saying is yeah here‘s a bunch of data.
from:http://blog.csdn.net/pipisorry/article/details/43089121
MachineLearning - Introduction (Week 1)
原文地址:http://blog.csdn.net/pipisorry/article/details/43089121