Chapter 1: Introduction to Machine Learning

Date: 2014-12-02 22:12:55

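The main goal of this first chapter is to cover the basic concepts: what machine learning is, what supervised and unsupervised learning are, and so on.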

I. What is machine learning

1. Machine learning is a relatively new field of study in which computers gain the ability to learn without being explicitly programmed:

"Field of study that gives computers the ability to learn without being explicitly programmed." (Arthur Samuel, 1959)

2. "A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E." (Tom Mitchell, 1998)

Question:

Suppose your email program watches which emails you do or do not mark as spam, and based on that learns how to better filter spam.  What is the task T in this setting? 

A. Classifying emails as spam or not spam. (This is the task T.)

B. Watching you label emails as spam or not spam. (This is the experience E.)

C. The number (or fraction) of emails correctly classified as spam/not spam. (This is the performance measure P.)

D. None of the above—this is not a machine learning problem.
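To make the roles of T, E, and P concrete, here is a minimal sketch of the spam example. The word-count scoring rule, the function names, and the tiny email lists are all made up for illustration; this is not how a real spam filter works, only a toy that shows P improving as E grows.

```python
from collections import Counter

def train(labeled_emails):
    """Experience E: count how often each word appears in spam and in non-spam."""
    spam_words, ham_words = Counter(), Counter()
    for text, is_spam in labeled_emails:
        (spam_words if is_spam else ham_words).update(text.lower().split())
    return spam_words, ham_words

def classify(model, text):
    """Task T: label an email as spam (True) or not spam (False)."""
    spam_words, ham_words = model
    score = sum(spam_words[w] - ham_words[w] for w in text.lower().split())
    return score > 0

def accuracy(model, test_emails):
    """Performance measure P: fraction of emails classified correctly."""
    correct = sum(classify(model, text) == is_spam for text, is_spam in test_emails)
    return correct / len(test_emails)

# Hypothetical emails that the user has marked as spam (True) or not spam (False).
experience = [
    ("meeting agenda for monday", False),
    ("win free money now", True),
    ("free prize claim now", True),
    ("lunch on monday", False),
]
# Hypothetical held-out emails used only to measure P.
test_emails = [
    ("claim your free prize", True),
    ("agenda for lunch meeting", False),
    ("win money", True),
    ("monday meeting", False),
]

acc_small = accuracy(train(experience[:1]), test_emails)  # little experience
acc_full = accuracy(train(experience), test_emails)       # more experience
print(acc_small, acc_full)
```

With only the first labeled email the toy filter has never seen spam, so it misses both spam test emails; with all four labeled emails it classifies the whole test set correctly.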

II. Machine learning algorithms

1. Supervised learning

2. Unsupervised learning

3. Reinforcement learning

4. Recommender systems

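III. Supervised learning

The defining feature of supervised learning is that the training samples are labeled.

1. Regression: predict a continuous output value for a given (test) sample.

2. Classification: predict the discrete label of a given (test) sample. For example, in the tumor problem, 1 means the tumor is malignant and 0 means it is benign.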
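The difference between the two problem types can be sketched in a few lines. This is only an illustration under simple assumptions: the regression part fits a straight line by ordinary least squares, and the classification part uses a midpoint threshold between the two class means, which is a stand-in chosen for brevity rather than a method from the notes. All data and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def fit_threshold(sizes, labels):
    """Classification: midpoint between the mean benign and mean malignant size."""
    benign = [s for s, y in zip(sizes, labels) if y == 0]
    malignant = [s for s, y in zip(sizes, labels) if y == 1]
    return (sum(benign) / len(benign) + sum(malignant) / len(malignant)) / 2

def predict_label(threshold, size):
    """Return 1 (malignant) if the size is above the threshold, else 0 (benign)."""
    return 1 if size > threshold else 0

# Regression: labeled samples are (month, units sold); the output is a number.
months = [1, 2, 3, 4]
units_sold = [12, 14, 16, 18]
slope, intercept = fit_line(months, units_sold)
forecast = slope * 5 + intercept  # predicted sales for month 5

# Classification: labeled samples are (tumor size, 0 or 1); the output is a label.
sizes = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
labels = [0, 0, 0, 1, 1, 1]
threshold = fit_threshold(sizes, labels)
print(forecast, predict_label(threshold, 2.5), predict_label(threshold, 6.5))
```

In both cases the training data carries the answer (a sales figure or a 0/1 label); what differs is whether the predicted output is a continuous value or one of a fixed set of classes.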

Question:

Problem 1: You have a large inventory of identical items.  You want to predict how many of these items will sell over the next 3 months.

Problem 2: You’d like software to examine individual customer accounts, and for each account decide if it has been hacked/compromised.

Should you treat these as classification or as regression problems? 

 

A. Treat both as classification problems. 

B. Treat problem 1 as a classification problem, problem 2 as a regression problem. 

C. Treat problem 1 as a regression problem, problem 2 as a classification problem. 

D. Treat both as regression problems. 

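IV. Unsupervised learning

The defining feature of unsupervised learning is that the samples have no labels. Clustering is the classic example of unsupervised learning.

[Figures: two images from the original post, not preserved]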
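As a rough illustration of clustering on unlabeled data, here is a minimal one-dimensional k-means sketch. The notes do not name a specific clustering algorithm, so k-means is an assumption made for this example, as are the data points, the function name, and the choice to start the two centers at the smallest and largest point.

```python
def kmeans_1d(points, initial_centers, iterations=10):
    """Alternate between assigning points to the nearest center and
    moving each center to the mean of its assigned points."""
    centers = list(initial_centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Unlabeled samples: there is no 0/1 column, only the values themselves.
points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
centers, clusters = kmeans_1d(points, [min(points), max(points)])
print(centers)
print(clusters)
```

No labels are supplied anywhere; the two groups come out of the structure of the data alone, which is what separates this from the supervised examples.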

Question:

Which of the following would you address using an unsupervised learning algorithm?

A. Given email labeled as spam/not spam, learn a spam filter.

B. Given a set of news articles found on the web, group them into sets of articles about the same story.

C. Given a database of customer data, automatically discover market segments and group customers into different market segments. 

D. Given a dataset of patients diagnosed as either having diabetes or not, learn to classify new patients as having diabetes or not. 

 

Original source: http://www.cnblogs.com/fjndlsh/p/4138542.html
