Willers, Joel.
Methods for Extracting Data from the Internet.
Record type:
Bibliographic - language material, manuscript : Monograph/item
Title/Author:
Methods for Extracting Data from the Internet./
Author:
Willers, Joel.
Physical description:
1 online resource (102 pages)
Notes:
Source: Masters Abstracts International, Volume: 57-02.
Contained By:
Masters Abstracts International 57-02(E).
Subject:
Sociology.
Electronic resource:
click for full text (PQDT)
ISBN:
9780355337754
Methods for Extracting Data from the Internet.
- 1 online resource (102 pages)
Source: Masters Abstracts International, Volume: 57-02.
Thesis (M.S.)
Includes bibliographical references
The advent of the Internet has yielded exciting new opportunities for the collection of large amounts of structured and unstructured social scientific data. This thesis describes two such methods for harvesting data from websites and web services: web scraping and connecting to an application programming interface (API). I describe the development and implementation of tools for each of these methods. In my review of the two related yet distinct data collection methods, I provide concrete examples of each. To illustrate the first method, 'scraping' data from publicly available data repositories (specifically the Google Books Ngram Corpus), I developed a tool and made it available to the public on a website. The Google Books Ngram Corpus contains groups of words used in millions of books that were digitized and catalogued. The corpus has been made available for public use, but in its current form, accessing the data is tedious, time-consuming, and error-prone. For the second method, utilizing an API from a web service (specifically the Twitter Streaming API), I used a code library and the R programming language to develop a program that connects to the Twitter API to collect public posts known as tweets. I review prior studies that have used these data, after which I report results from a case study involving references to countries. The relative prestige of nations is compared based on the frequency of mentions in English literature and mentions in tweets.
Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2018
Mode of access: World Wide Web
ISBN: 9780355337754
Subjects--Topical Terms:
551705
Sociology.
Index Terms--Genre/Form:
554714
Electronic books.
LDR 02703ntm a2200349Ki 4500
001 909874
005 20180426091049.5
006 m o u
007 cr mn||||a|a||
008 190606s2017 xx obm 000 0 eng d
020 $a 9780355337754
035 $a (MiAaPQ)AAI10605965
035 $a (MiAaPQ)iastate:16725
035 $a AAI10605965
040 $a MiAaPQ $b eng $c MiAaPQ
099 $a TUL $f hyy $c available through World Wide Web
100 1 $a Willers, Joel. $3 1180865
245 1 0 $a Methods for Extracting Data from the Internet.
264 0 $c 2017
300 $a 1 online resource (102 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Masters Abstracts International, Volume: 57-02.
500 $a Adviser: Shawn Dorius.
502 $a Thesis (M.S.) $c Iowa State University $d 2017.
504 $a Includes bibliographical references
520 $a The advent of the Internet has yielded exciting new opportunities for the collection of large amounts of structured and unstructured social scientific data. This thesis describes two such methods for harvesting data from websites and web services: web scraping and connecting to an application programming interface (API). I describe the development and implementation of tools for each of these methods. In my review of the two related yet distinct data collection methods, I provide concrete examples of each. To illustrate the first method, 'scraping' data from publicly available data repositories (specifically the Google Books Ngram Corpus), I developed a tool and made it available to the public on a website. The Google Books Ngram Corpus contains groups of words used in millions of books that were digitized and catalogued. The corpus has been made available for public use, but in its current form, accessing the data is tedious, time-consuming, and error-prone. For the second method, utilizing an API from a web service (specifically the Twitter Streaming API), I used a code library and the R programming language to develop a program that connects to the Twitter API to collect public posts known as tweets. I review prior studies that have used these data, after which I report results from a case study involving references to countries. The relative prestige of nations is compared based on the frequency of mentions in English literature and mentions in tweets.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Sociology. $3 551705
650 4 $a Web studies. $3 1148502
655 7 $a Electronic books. $2 local $3 554714
690 $a 0626
690 $a 0646
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a Iowa State University. $b Sociology. $3 1180866
773 0 $t Masters Abstracts International $g 57-02(E).
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10605965 $z click for full text (PQDT)