
ECON4016 - FINAL EXAM

Posted: 2019-05-12 19:43:40

The final exam consists of four small projects. Choose two of them to complete and send me
your report. For each project you choose, perform a data analysis using the data I provide
and the techniques we discussed in class. For each project, your report should state what
questions you were trying to answer with the analysis, what you did, and what you found.
Please send your report and R script to my email (colonct@gmail.com) by 9 AM on 18 May
2019. NO late submissions will be considered.
1. Please download the subsample data of Hong Kong census (2001-2016) from the following
link:
https://drive.google.com/file/d/1Md6c5J0VcV0_g_veL48i9upJNOKoLc-V/view?usp=sharing
The zipped file contains four data files: hkcensus2001025.dta, hkcensus2006025.dta,
hkcensus2011025.dta, hkcensus2016025.dta which are the subsample data for HK census
in 2001, 2006, 2011 and 2016 respectively. You can use the following code to read
these .dta files into R (replace the path with the location of your own file):
library(foreign)
mydata <- read.dta("c:/mydata.dta")
Select two or three variables that interest you from these datasets and demonstrate
the relationship between them using visualisation. You can also extend your
analysis by showing the temporal changes and/or spatial distribution of these variables.
Examples of research questions are:
1) How does gender inequality in employment change as the education level of
women rises?
2) How are poor households distributed spatially across the districts of Hong Kong,
and how does this spatial pattern change over time?
These are only examples; feel free to choose other research questions that interest you.
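As an illustration of example 1), here is a minimal sketch of the kind of tabulation and plot that could be used. It runs on simulated data rather than the census files, and the column names (year, female, degree, employed) are assumptions made for the sketch, not the actual variable names in the .dta files.

```r
# Simulated stand-in for the four stacked census subsamples.
# All column names and probabilities here are hypothetical.
set.seed(1)
years <- c(2001, 2006, 2011, 2016)
census <- do.call(rbind, lapply(years, function(y) {
  n <- 500
  female <- rbinom(n, 1, 0.5)
  degree <- rbinom(n, 1, 0.2 + 0.02 * (y - 2001))
  employed <- rbinom(n, 1, 0.55 + 0.15 * degree - 0.10 * female * (1 - degree))
  data.frame(year = y, female = female, degree = degree, employed = employed)
}))

# Employment rate in each year x gender x education cell
rate <- with(census,
             tapply(employed, list(year = year, female = female, degree = degree), mean))

# Gender gap (male rate minus female rate); rows are years, columns are degree = 0 / 1
gap <- rate[, "0", ] - rate[, "1", ]

# One line per education level, showing how the gap moves over the census years
matplot(years, gap, type = "b", pch = 1:2, lty = 1:2, col = 1:2,
        xlab = "Census year", ylab = "Male minus female employment rate")
legend("topright", legend = c("No degree", "Degree"), pch = 1:2, lty = 1:2, col = 1:2)
```

With the real data, the four files would first be read with read.dta, given a year column, and stacked with rbind before the same tabulation is applied.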
2. Please download the data for textual data analysis from the following link:
https://drive.google.com/file/d/1vmqN5wsUYvAq0yzdpHud32jbB4EKndTD/view?usp=sharing
It contains two datasets, both in .csv format:
1) Historical news headlines from the Reddit WorldNews channel: the top 25
headlines for each date, based on Reddit users' votes. (RedditNews.csv
contains two columns: the first column is the "date" and the second column is the
"news headlines". All headlines are ranked from top to bottom based on how hot they
are.)
2) The Dow Jones Industrial Average (DJIA), roughly between 2009 and 2016.
Please use the first dataset to generate some useful indices or variables that summarise
the information in those texts, and see whether these indices or variables have
any predictive power for the stock price in the second dataset. (Hint: you can
use either simple regression or more complex machine learning methods to test for the
relationship.)
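One very simple way to turn headlines into an index is to compute the share of "negative" words per day and regress the daily index return on it. The sketch below does this on made-up toy data: the headlines, prices, column names (Date, News, Close) and the word list are all assumptions for illustration, not the contents of the provided files.

```r
# Toy stand-ins for RedditNews.csv and the DJIA file (all values made up).
news <- data.frame(
  Date = rep(c("2016-01-04", "2016-01-05", "2016-01-06", "2016-01-07", "2016-01-08"),
             each = 2),
  News = c("Markets crash amid crisis fears", "Peace deal signed",
           "Economy shows strong growth", "Leaders agree on peace plan",
           "War threat grows", "Attack kills dozens",
           "Record growth reported", "Talks fail as crisis deepens",
           "Trade agreement reached", "Crash fears ease"),
  stringsAsFactors = FALSE
)
djia <- data.frame(
  Date = c("2016-01-04", "2016-01-05", "2016-01-06", "2016-01-07", "2016-01-08"),
  Close = c(100, 101, 99, 100, 99.5),
  stringsAsFactors = FALSE
)

# Hypothetical hand-made word list; a real analysis would use a sentiment lexicon.
negative_words <- c("crash", "crisis", "war", "attack", "fail", "fears", "threat")

# Share of negative words among all words in one headline
neg_share <- function(text) {
  words <- unlist(strsplit(tolower(text), "[^a-z]+"))
  words <- words[nzchar(words)]
  if (length(words) == 0) return(NA_real_)
  mean(words %in% negative_words)
}

news$neg <- vapply(news$News, neg_share, numeric(1), USE.NAMES = FALSE)

# Daily index: average negative share over that day's headlines
daily <- aggregate(neg ~ Date, data = news, FUN = mean)

# Daily log return of the index; the first day has no previous close
djia$ret <- c(NA, diff(log(djia$Close)))

merged <- merge(daily, djia, by = "Date")
fit <- lm(ret ~ neg, data = merged)  # rows with NA returns are dropped by lm
print(coef(fit))
```

To test predictive power rather than same-day association, the index would need to be lagged so that day t headlines are matched with the day t+1 return.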
3. Please download the U.S. patent dataset for network analysis from the following link:
https://drive.google.com/file/d/1qytpbWCkyZNYG4GGHdo-P7OxZjtTYYBV/view?usp=sharing
It contains two datasets, both in .txt format:
1) acite75_99.txt: all US patent citations for utility patents granted between 1975
and 1999 (the edge file)
2) apat63_99.txt: information on all utility patents (the node file)
The data documentation files Cite75_99.txt and pat63_99.txt contain
a detailed description of all variables.
Please use these datasets to create a citation network for the U.S. patents. Try to
visualise and describe the characteristics of this network, and try to extract some useful
information from the analysis (e.g. which were the key innovations in this patent
dataset).
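A common first pass at "key innovations" in a citation network is to count how often each patent is cited (its in-degree). The sketch below shows this on a tiny made-up edge list in base R; the column names citing and cited and the patent IDs are hypothetical, and the real edge file should be checked against its documentation before adapting this.

```r
# Toy edge list: each row means patent `citing` cites patent `cited`.
# IDs and column names are made up for illustration.
edges <- data.frame(
  citing = c(5, 5, 6, 6, 7, 7, 7, 8),
  cited  = c(1, 2, 1, 3, 1, 2, 3, 1)
)

# In-degree: number of citations each patent receives, most cited first
in_degree <- sort(table(edges$cited), decreasing = TRUE)

# Out-degree: number of citations each patent makes
out_degree <- table(edges$citing)

# Basic network descriptives for a directed graph without self-loops
nodes <- unique(c(edges$citing, edges$cited))
n_nodes <- length(nodes)
n_edges <- nrow(edges)
density <- n_edges / (n_nodes * (n_nodes - 1))

# The most cited patents are candidates for "key innovations"
print(head(in_degree, 3))
```

In-degree is only one possible importance measure, and a dedicated network package would be needed for drawing the graph. Because the full citation file is large, visualising a subsample (for example one technology class) is likely to be more practical.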
4. Please download the data of real estate transactions for building a predictive model
from the following link:
https://drive.google.com/file/d/1T6e6-iy15A9OQZyjsbzWDNkiTiOlHHrW/view?usp=sharing
The link points to a guangzhou2017.dta file, which contains all the real estate transactions
in Guangzhou in 2017. You can use the same code as in the first small project
to read this file into R. Please use the apartment characteristics in this
dataset to build a model for predicting house prices using a tree-based or neural
network method.
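For the tree-based option, a minimal sketch using a regression tree from the rpart package is shown below. It uses simulated apartments, not the Guangzhou file: the variables area, rooms, floor and price and the price formula are assumptions made purely so the example runs on its own.

```r
library(rpart)  # recommended package included with standard R installations

# Simulated apartments; variable names and the price formula are hypothetical.
set.seed(42)
n <- 400
flats <- data.frame(
  area  = runif(n, 40, 150),
  rooms = sample(1:4, n, replace = TRUE),
  floor = sample(1:30, n, replace = TRUE)
)
flats$price <- 2 * flats$area + 15 * flats$rooms + 0.5 * flats$floor + rnorm(n, sd = 10)

# Hold out a test set so the model is judged on data it has not seen
train_idx <- sample(n, 300)
train <- flats[train_idx, ]
test <- flats[-train_idx, ]

# Regression tree (method = "anova" for a continuous outcome)
tree <- rpart(price ~ area + rooms + floor, data = train, method = "anova")

# Out-of-sample error, compared with always predicting the training mean
pred <- predict(tree, newdata = test)
rmse <- sqrt(mean((test$price - pred)^2))
baseline <- sqrt(mean((test$price - mean(train$price))^2))
print(c(tree_rmse = rmse, baseline_rmse = baseline))
```

Comparing the tree against a naive baseline on held-out data is one simple way to show that the apartment characteristics carry predictive information; the same train/test split can be reused to compare against other models.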


Original source: https://www.cnblogs.com/blogy/p/10853311.html
