Web Scraping & Data Analysis

Posted by hellyou on 2024-10-24

Assignment 1: Web Scraping & Data Analysis

Sep 31, 2024

In this assignment, you should work with data from

The Movie Database (TMDb) is a popular platform for movie enthusiasts, offering a vast collection of movies from all genres and regions. TMDb provides users with detailed information such as movie titles, release dates, cast, crew, genres, ratings, and more. It's a go-to source for finding information about both classic and upcoming films, as well as the latest in TV shows.

Everyone is interested in great movies, but with so many films released each year, how can we find the best ones? Scraping high-quality data from movie websites is crucial. In this project, we will utilize the skills we've learned with requests and regular expressions to scrape essential movie details from The Movie Database (TMDb) website, allowing us to build a comprehensive dataset for further analysis.

Task 1. You are required to scrape 200 movies from the website and save the result into

Title of Movie: 5 marks

5 marks

You are free to explore data with more properties if needed.
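A minimal sketch of the requests-plus-regular-expressions approach for Task 1. It assumes a listing page whose markup resembles the hypothetical SAMPLE_HTML below; the real TMDb pages will differ, so the pattern has to be adapted after inspecting the page source. The names parse_movies, fetch_page, and write_csv are illustrative, and the href and date fields are only examples of extra properties.

```python
import csv
import io
import re

# Hypothetical listing markup used for illustration; real TMDb HTML may differ.
SAMPLE_HTML = """
<div class="card">
  <h2><a href="/movie/1" title="Movie A">Movie A</a></h2>
  <p class="date">Oct 01, 2024</p>
</div>
<div class="card">
  <h2><a href="/movie/2" title="Movie B">Movie B</a></h2>
  <p class="date">Sep 15, 2023</p>
</div>
"""

# One match per movie card: capture the link, the title, and the release date.
CARD_RE = re.compile(
    r'<h2><a href="(?P<href>/movie/\d+)"[^>]*>(?P<title>[^<]+)</a></h2>\s*'
    r'<p class="date">(?P<date>[^<]+)</p>'
)


def parse_movies(html):
    """Return one dict per movie card found in the HTML string."""
    return [match.groupdict() for match in CARD_RE.finditer(html)]


def fetch_page(url):
    """Download one page of HTML (not called in this offline sketch)."""
    import requests  # imported here so the parsing helpers also work offline

    response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10)
    response.raise_for_status()
    return response.text


def write_csv(movies, fileobj):
    """Write the movie dicts as CSV rows with a header line."""
    writer = csv.DictWriter(fileobj, fieldnames=["title", "date", "href"])
    writer.writeheader()
    writer.writerows(movies)


movies = parse_movies(SAMPLE_HTML)
buffer = io.StringIO()
write_csv(movies, buffer)
print(buffer.getvalue())
```

For a real run, fetch_page would be called on successive listing pages and the parsed rows accumulated until 200 movies are collected, with write_csv given a file opened with newline="" instead of the in-memory buffer.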

Task 2. You are required to do a data analysis on the data. What do you think is interesting about this data? Tell a story about some interesting thing you have discovered by looking at the data. (60 marks)

For example, which one is the best movie you might watch? Does the type of movie affect movie sales? Which category of movies sells the best?

Note: This is an open-topic project. You are required to provide a novel topic and demonstrate your hypotheses (viewpoints) with data analysis and figure illustrations. The report and running code (web scraping + data analysis) should be submitted as a Jupyter Notebook file.
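As one possible starting point for a question such as "which category of movies rates best", the sketch below groups movies by genre and compares mean ratings. The rows in SAMPLE_CSV are made-up placeholders, not real TMDb data, and the column names title, genre, and rating are assumptions about how the scraped CSV might be laid out.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Placeholder rows standing in for the scraped CSV; not real TMDb data.
SAMPLE_CSV = """title,genre,rating
Movie A,Drama,8.0
Movie B,Drama,7.0
Movie C,Comedy,6.5
Movie D,Comedy,7.5
Movie E,Action,6.0
"""


def mean_rating_by_genre(rows):
    """Map each genre to the mean rating of its movies."""
    ratings = defaultdict(list)
    for row in rows:
        ratings[row["genre"]].append(float(row["rating"]))
    return {genre: mean(values) for genre, values in ratings.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
averages = mean_rating_by_genre(rows)

# Print genres from highest to lowest mean rating.
for genre, avg in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(f"{genre}: {avg:.2f}")
```

The resulting dictionary can feed a bar chart directly, and the same grouping could be done with a pandas groupby inside the notebook.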

Submission Checklist:

Items (Yes/No)

Jupyter Notebook code

your_name+id.csv

Marking Guidelines

Marking Criteria

Idea (5 marks)

• Presents a novel idea.

• Clearly demonstrates your viewpoints.

• Demonstrates good understanding of the topic.

Discussion (30 marks)

• Provides convincing arguments for your viewpoints.

• Backs up arguments with appropriate data analysis results.

• Visualizes data analysis results by using more than 5 figures.

Organization (20 marks)

• Use of figures to support ideas discussed in the report.

• The quality of the figures.

• These figures should be informative.

• Use of sub-titles and/or clear topic sentences.

• Use multiple visualization methods (line, bar, pie chart, etc.).

Writing Style (5 marks)

• Concise writing style.

• Strong scientific writing without grammatical errors.
