Data Mining Dataset Download Resources

Posted by Thinkgamer_gyt on 2015-08-18

I came across a good collection of resources online and am sharing it here:

1. Climate monitoring dataset: http://cdiac.ornl.gov/ftp/ndp026b

2. Several useful sites for downloading test datasets

http://www.fs.fed.us/fire/fuelman/

http://www.cs.toronto.edu/~roweis/data.html
http://kdd.ics.uci.edu/summary.task.type.html
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/
http://www.phys.uni.torun.pl/~duch/software.html
The Reuters dataset can be found at: http://www.research.att.com/~lewis/reuters21578.html
This site offers many kinds of datasets: http://kdd.ics.uci.edu/summary.data.type.html
For text classification, another usable dataset is the rainbow dataset:
http://www-2.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes.html

3. Machine learning datasets collected by UCI
ftp://pami.sjtu.edu.cn/
http://www.ics.uci.edu/~mlearn//MLRepository.htm

4. StatLib
http://liama.ia.ac.cn/SCILAB/scilabindexgb.htm
http://lib.stat.cmu.edu/

5. Sites on data mining for investment funds
http://www.gotofund.com/index.asp

http://lans.ece.utexas.edu/~strehl/

6. Text classification and Web data
http://www-2.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes.html

http://www.w3.org/TR/WD-logfile-960221.html
http://www.w3.org/Daemon/User/Config/Logging.html#AccessLog
http://www.w3.org/1998/11/05/WC-workshop/Papers/bala2.html
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/
http://www.web-caching.com/traces-logs.html
http://www-2.cs.cmu.edu/webkb
http://www.cs.auc.dk/research/DP/tdb/TimeCenter/TimeCenterPublications/TR-75.pdf
http://www.cs.cornell.edu/projects/kddcup/index.html

7. Time series data
http://www.stat.wisc.edu/~reinsel/bjr-data/

8. Test data for the Apriori algorithm
http://www.almaden.ibm.com/cs/quest/syndata.html

9. Links to data generators
http://www.cse.cuhk.edu.hk/~kdd/data_collection.html
http://www.almaden.ibm.com/cs/quest/syndata.html

10. Association:
http://flow.dl.sourceforge.net/sourceforge/weka/regression-datasets.jar
http://www.almaden.ibm.com/software/quest/Resources/datasets/syndata.html#assocSynData

11. WEKA:
http://flow.dl.sourceforge.net/sourceforge/weka/regression-datasets.jar
(1) A jarfile containing 37 classification problems, originally obtained from the UCI repository
http://prdownloads.sourceforge.net/weka/datasets-UCI.jar
(2) A jarfile containing 37 regression problems, obtained from various sources
http://prdownloads.sourceforge.net/weka/datasets-numeric.jar
(3) A jarfile containing 30 regression datasets collected by Luis Torgo
http://prdownloads.sourceforge.net/weka/regression-datasets.jar
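
A jar file is an ordinary zip archive, so once one of the jars above has been downloaded it can be inspected without Java. The following is a minimal Python sketch, assuming the archive stores its datasets as `.arff` files (this post does not confirm the internal layout). The helper names `list_arff_entries` and `read_relation_name` are hypothetical and exist only for this illustration.

```python
import io
import zipfile


def list_arff_entries(jar_source):
    """Return the sorted names of .arff entries in a jar (zip) archive.

    jar_source may be a filesystem path or a seekable binary file object.
    Assumption: the datasets are stored as .arff files inside the jar.
    """
    with zipfile.ZipFile(jar_source) as jar:
        return sorted(
            name for name in jar.namelist() if name.lower().endswith(".arff")
        )


def read_relation_name(jar_source, entry_name):
    """Return the text after '@relation' in one ARFF entry, or None."""
    with zipfile.ZipFile(jar_source) as jar:
        with jar.open(entry_name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", errors="replace")
            for line in text:
                stripped = line.strip()
                if stripped.lower().startswith("@relation"):
                    return stripped[len("@relation"):].strip()
    return None


if __name__ == "__main__":
    # Build a tiny in-memory archive so the sketch runs without a download.
    demo = io.BytesIO()
    with zipfile.ZipFile(demo, "w") as archive:
        archive.writestr("demo/toy.arff", "@relation toy\n@attribute x numeric\n")
    for entry in list_arff_entries(demo):
        print(entry, read_relation_name(demo, entry))
```

To use it on a real download, pass the local path of the saved jar (for example a file saved as `datasets-UCI.jar`) in place of the in-memory demo archive.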

12. Cancer gene data:
http://www.broad.mit.edu/cgi-bin/cancer/datasets.cgi

13. Financial data:
http://lisp.vse.cz/pkdd99/Challenge/chall.htm

14. Another very good resource is http://kdd.ics.uci.edu/, where the datasets are organized by application area.
