Java crawler, day 1: bug log
Note: build the crawler with the Maven that ships with IDEA.
bug1:
Cannot resolve symbol 'response'
Cause: response is declared inside the try block, so it is out of scope in the finally block:
try {
    CloseableHttpResponse response = httpClient.execute(httpGet);
    if (response.getStatusLine().getStatusCode() == 200) {
        String content = EntityUtils.toString(response.getEntity(), "utf-8");
        System.out.println(content.length());
    }
} catch (IOException e) {
    e.printStackTrace();
} finally {
    try {
        response.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
    try {
        httpClient.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Fix: declare response before the try block so that finally can see it:
CloseableHttpResponse response = null;
try {
    response = httpClient.execute(httpGet);
    if (response.getStatusLine().getStatusCode() == 200) {
        String content = EntityUtils.toString(response.getEntity(), "utf-8");
        System.out.println(content.length());
    }
} catch (IOException e) {
    e.printStackTrace();
} finally {
    // response is still null if execute() threw, so guard the close.
    if (response != null) {
        try {
            response.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    try {
        httpClient.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
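As an alternative to the hand-written finally block, CloseableHttpClient and CloseableHttpResponse both implement Closeable, so try-with-resources should work for them as well. The sketch below only illustrates the mechanism: TrackedResource is a hypothetical stand-in class (not part of HttpClient) so that the example runs with the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for CloseableHttpClient / CloseableHttpResponse,
// so the sketch runs without any third-party dependency.
class TrackedResource implements AutoCloseable {
    static final List<String> LOG = new ArrayList<>();
    private final String name;

    TrackedResource(String name) {
        this.name = name;
        LOG.add("open " + name);
    }

    @Override
    public void close() {
        LOG.add("close " + name);
    }
}

class TryWithResourcesSketch {
    static List<String> run() {
        TrackedResource.LOG.clear();
        // Resources are closed automatically, in reverse order of declaration,
        // even if the body throws. No null check or finally block is needed.
        try (TrackedResource client = new TrackedResource("client");
             TrackedResource response = new TrackedResource("response")) {
            TrackedResource.LOG.add("use response");
        }
        return new ArrayList<>(TrackedResource.LOG);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With the real classes, the same shape would put HttpClients.createDefault() and httpClient.execute(httpGet) in the resource list, and the variable scope problem from bug1 does not come up at all.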
bug2:
org.apache.http.client.ClientProtocolException
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:187)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
at cn.itcast.crawler.test.HttpGetTest.main(HttpGetTest.java:21)
Caused by: org.apache.http.ProtocolException: Target host is not specified
at org.apache.http.impl.conn.DefaultRoutePlanner.determineRoute(DefaultRoutePlanner.java:71)
at org.apache.http.impl.client.InternalHttpClient.determineRoute(InternalHttpClient.java:125)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
... 3 more
Exception in thread "main" java.lang.NullPointerException
at cn.itcast.crawler.test.HttpGetTest.main(HttpGetTest.java:31)
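Cause: the URL passed to HttpGet is an empty string, so HttpClient cannot work out the target host. The NullPointerException that follows presumably comes from calling response.close() in the finally block while response is still null: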
CloseableHttpClient httpClient = HttpClients.createDefault();
HttpGet httpGet = new HttpGet("");
CloseableHttpResponse response = null;
try {
    response = httpClient.execute(httpGet);
    if (response.getStatusLine().getStatusCode() == 200) {
        String content = EntityUtils.toString(response.getEntity(), "utf-8");
        System.out.println(content.length());
    }
} catch (IOException e) {
    e.printStackTrace();
}
Fix: pass a complete URL, including the scheme:
CloseableHttpClient httpClient = HttpClients.createDefault();
HttpGet httpGet = new HttpGet("http://www.itcast.cn");
CloseableHttpResponse response = null;
try {
    response = httpClient.execute(httpGet);
    if (response.getStatusLine().getStatusCode() == 200) {
        String content = EntityUtils.toString(response.getEntity(), "utf-8");
        System.out.println(content.length());
    }
} catch (IOException e) {
    e.printStackTrace();
}
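To catch this kind of mistake before the request is sent, one option is to check the URL string up front. Below is a minimal sketch using only java.net.URI; UrlCheck and hasTargetHost are names made up for this example, and the check is only an approximation of what HttpClient needs.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UrlCheck {
    // Returns true only if the string parses as a URI with both a scheme and a host.
    static boolean hasTargetHost(String url) {
        if (url == null) {
            return false;
        }
        try {
            URI uri = new URI(url);
            return uri.getScheme() != null && uri.getHost() != null;
        } catch (URISyntaxException e) {
            // Not a syntactically valid URI at all.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasTargetHost(""));                     // empty string, as in bug2
        System.out.println(hasTargetHost("www.itcast.cn"));        // no scheme, parsed as a path
        System.out.println(hasTargetHost("http://www.itcast.cn")); // scheme and host present
    }
}
```

Calling hasTargetHost on the string before constructing HttpGet makes the failure show up as a simple boolean rather than a ClientProtocolException stack trace.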
bug3: still no log output after commenting out the test scope.
log4j:WARN No appenders could be found for logger (org.apache.http.client.protocol.RequestAddCookies).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
</dependency>
<!-- https://mvnrepository.com/artifact/org.slf4j/slf4j-log4j12 -->
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-log4j12</artifactId>
    <version>1.7.25</version>
    <!-- <scope>test</scope> -->
</dependency>
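Fix:
log4j 1.2 needs its output configured in a log4j.properties file on the classpath (in a Maven project, typically under src/main/resources):
# Global logging configuration. This is a debugging setup; switch to INFO or higher in production.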
log4j.rootLogger=DEBUG, stdout
# Console output...
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern= %-d{yyyy-MM-dd HH:mm:ss} [ %t:%r ] - [ %p ] %m%n
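If the warning still shows up after adding the file, it may simply not be on the classpath. A quick way to check is sketched below. Log4jConfigCheck and isOnClasspath are names invented for this example, and the lookup through the system class loader is only an approximation of how log4j itself searches.

```java
import java.net.URL;

class Log4jConfigCheck {
    // Returns true if the named resource is visible to the system class loader.
    static boolean isOnClasspath(String resourceName) {
        URL url = ClassLoader.getSystemResource(resourceName);
        return url != null;
    }

    public static void main(String[] args) {
        // Prints whether log4j.properties can be found on the classpath.
        System.out.println(isOnClasspath("log4j.properties"));
    }
}
```

If this prints false when run from the project, the file is probably in the wrong directory or has not been copied to the build output.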
Source code:
package cn.itcast.crawler.test;

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

import java.io.IOException;

public class HttpGetTest {
    public static void main(String[] args) {
        // Create the client and a GET request for the target page.
        CloseableHttpClient httpClient = HttpClients.createDefault();
        HttpGet httpGet = new HttpGet("http://www.itcast.cn");
        // Declared outside try so that the finally block can see it.
        CloseableHttpResponse response = null;
        try {
            response = httpClient.execute(httpGet);
            if (response.getStatusLine().getStatusCode() == 200) {
                String content = EntityUtils.toString(response.getEntity(), "utf-8");
                System.out.println(content.length());
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // response is still null if execute() threw, so guard the close.
            if (response != null) {
                try {
                    response.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            try {
                httpClient.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}