COMP2741/8714: Application Development Programming Project: YouTube Trender
5pm, Thursday, Week 7
You do NOT need a YouTube/Google account to complete and achieve a high mark for this project. You will only be required to read YouTube video data from a file.
You can do this project in pairs. You will need to let me know who you are pairing with, and only one submission is required.
The project is worth 20% of the final topic mark. The 20% is broken down into three stages as described below.
The task is to design, implement, test, and document a command line application to analyse the results of a YouTube search using the YouTube Data API: e.g. “Trending Topics” of YouTube videos.
YouTube is a global video-sharing website headquartered in San Bruno, California, United States. The site allows users to upload, view, rate, share, and comment on videos. Available content includes video clips, TV clips, music videos, movie trailers, and other content such as video blogging, short original videos, and educational videos. Users can also add a title and description to their videos, and by examining these and detecting which words appear frequently across titles and descriptions we are able to detect “trending topics.” See below for more information on YouTube and the YouTube Data API that generated the list of videos used in this project:
Wikipedia on YouTube: https://en.wikipedia.org/wiki/YouTube
YouTube Data API: https://developers.google.com/youtube/v3/getting-started
YouTube Data API for listing videos: https://developers.google.com/youtube/v3/docs/videos/list
It is your job to develop a command line application that can detect the YouTube Trending Topics.
Project Stages Overview
The project is broken down into three stages:
1. Parse a YouTube video data string into a YouTube video object [40%]
2. Sort Video objects by different features (e.g. title, channel title, views, date, etc.) [20%]
3. Index the list of videos for word usage, aka “Trending Topics” [20%]
You must also submit a document that describes what you have done and any additional features that you have built into your application. This document will be submitted via FLO.
4. Application description document [20%]
At each stage you should extend the command line interface to show the results of your development and allow the user to interact with the underlying data structures and algorithms.
Each stage will have marks allocated as follows:
• 70% Correctness
• 15% In Code Documentation
• 15% Demonstration and Testing Code
Your submission will be a zip of your IntelliJ project directory and a PDF of your Application Description document.
In Code Documentation
Your code must be documented. That is, it should contain Javadoc comments that describe the methods, classes, instance and class variables, etc., which you have written.
Use Javadoc (see http://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/index.html).
Stage 1 Detailed Description [up to 40%]
The aim of stage 1 is to parse a YouTube data string into a Video object. You will need to at least:
• Create a Java class called YouTubeVideo to represent the contents of a video (minimum of):
o channelId, channelTitle, publishedAt (date), title, description, viewCount
• Create a Java class called YouTubeDataParser that will take a file name and extract a list of videos from the given JSON file
• Create a Java class called YouTubeDataParserException that extends Exception to indicate a parsing error. The YouTubeDataParser should throw this exception when it encounters an error during the parsing of a JSON file (e.g. the file is not found, or the JSON file is malformed)
• Document the code using Javadoc.
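As a starting point, the classes listed above could be sketched as follows. This is a minimal sketch under stated assumptions, not the required design: the constructor shape, the getter names, keeping publishedAt as raw text, and the readRaw helper are all hypothetical, and the actual JSON parsing with the JSONP library is deliberately left out.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Holds the minimum set of fields listed for one video. */
class YouTubeVideo {
    private final String channelId;
    private final String channelTitle;
    private final String publishedAt; // kept as raw text in this sketch (an assumption)
    private final String title;
    private final String description;
    private final long viewCount;

    YouTubeVideo(String channelId, String channelTitle, String publishedAt,
                 String title, String description, long viewCount) {
        this.channelId = channelId;
        this.channelTitle = channelTitle;
        this.publishedAt = publishedAt;
        this.title = title;
        this.description = description;
        this.viewCount = viewCount;
    }

    String getChannelId() { return channelId; }
    String getChannelTitle() { return channelTitle; }
    String getPublishedAt() { return publishedAt; }
    String getTitle() { return title; }
    String getDescription() { return description; }
    long getViewCount() { return viewCount; }
}

/** Signals that a YouTube data file could not be read or parsed. */
class YouTubeDataParserException extends Exception {
    YouTubeDataParserException(String message, Throwable cause) {
        super(message, cause);
    }
}

class YouTubeDataParser {
    /**
     * Hypothetical helper: reads the raw file text and wraps any I/O failure
     * (such as a missing file) in a YouTubeDataParserException.
     */
    String readRaw(String fileName) throws YouTubeDataParserException {
        try {
            return new String(Files.readAllBytes(Paths.get(fileName)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new YouTubeDataParserException("Could not read " + fileName, e);
        }
    }
}
```

Wrapping the low-level exception keeps the original cause available to callers while letting them catch a single exception type for both missing and malformed files.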
Stage 2 Detailed Description [up to 20%]
Stage 2 is to sort Videos by different features of your newly created YouTubeVideo object (e.g. Channel, Date, etc.). You will need to at least be able to sort videos based on
• the channel title,
• the published at date,
• the number of views,
• something to do with the description of the video (e.g. its length).
You should use the static methods of the utility class Collections to sort your videos. Add Javadoc to your code to document the added functionality.
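One way to approach the four sort orders is with one Comparator per feature, passed to Collections.sort. The sketch below is illustrative only: SimpleVideo is a hypothetical stand-in for your YouTubeVideo class, the comparator names are made up, and it assumes publishedAt holds ISO-8601 text such as "2016-03-22T15:00:01.000Z".

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for YouTubeVideo with only the fields the comparators need. */
class SimpleVideo {
    final String channelTitle;
    final String publishedAt; // assumed to be ISO-8601 text
    final long viewCount;
    final String description;

    SimpleVideo(String channelTitle, String publishedAt, long viewCount, String description) {
        this.channelTitle = channelTitle;
        this.publishedAt = publishedAt;
        this.viewCount = viewCount;
        this.description = description;
    }
}

class VideoComparators {
    /** Orders videos alphabetically by channel title. */
    static final Comparator<SimpleVideo> BY_CHANNEL_TITLE =
            Comparator.comparing((SimpleVideo v) -> v.channelTitle);

    /** Orders videos from oldest to newest published date. */
    static final Comparator<SimpleVideo> BY_DATE =
            Comparator.comparing((SimpleVideo v) -> Instant.parse(v.publishedAt));

    /** Orders videos from fewest to most views. */
    static final Comparator<SimpleVideo> BY_VIEWS =
            Comparator.comparingLong((SimpleVideo v) -> v.viewCount);

    /** Orders videos from shortest to longest description. */
    static final Comparator<SimpleVideo> BY_DESCRIPTION_LENGTH =
            Comparator.comparingInt((SimpleVideo v) -> v.description.length());

    public static void main(String[] args) {
        List<SimpleVideo> videos = new ArrayList<>();
        videos.add(new SimpleVideo("Cats", "2016-03-22T15:00:01.000Z", 500L, "A long description"));
        videos.add(new SimpleVideo("Dogs", "2016-01-10T09:30:00.000Z", 120L, "Short"));

        // Collections.sort reorders the list in place using the given comparator.
        Collections.sort(videos, BY_VIEWS);
        for (SimpleVideo v : videos) {
            System.out.println(v.channelTitle + " " + v.viewCount);
        }
    }
}
```

Keeping each ordering in its own Comparator means new sort options can be added without touching the video class itself.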
Stage 3 Detailed Description [up to 20%]
The task of stage 3 is to implement a trending topic analyser for the list of videos. This amounts to indexing the word usage across the title and description of the videos. A simple way to do this is to count the number of times a single word appears in the list of videos. Figure 1 lists a pseudo code example for indexing the words used in videos.
SomeCollection words (this could be a set, list, map, etc.)
for each video
    for each word in video (title and description)
        if (word is in words)
            increment count for word in words
        else
            add word to words and set count to 1
        end
        associate video with word
Figure 1. Pseudo code for indexing the words in videos
NOTE: in this example the “word” is not cleaned in any way (e.g. all punctuation symbols are included in the “word”, so “twinkle,” is NOT the same as “twinkle”) and it counts multiple occurrences of a word in a title and description. Your indexer must satisfy this specification.
Once indexed, you need to be able to
• quickly find a word
• find the count associated with a word,
• find all the videos that use that word
• find the word that is used the most
• create a list of words sorted by their counts
You will need to use classes from the Collections framework such as Lists, Maps and Sets to implement this algorithm. You may need to create a new class to store the details about a word, e.g. the word itself, the count, and the videos associated with it.
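The indexing steps and lookups described above could be sketched with a Map keyed by the raw word. This is one possible shape, not the required one: the class names WordItem and WordIndex, identifying videos by their title, and splitting on whitespace are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical holder for one indexed word: the word, its count, and the videos using it. */
class WordItem {
    final String word;
    int count;
    final Set<String> videoTitles = new LinkedHashSet<>(); // videos identified by title in this sketch

    WordItem(String word) {
        this.word = word;
    }
}

class WordIndex {
    private final Map<String, WordItem> words = new HashMap<>();

    /** Indexes the title and description of one video without cleaning the words. */
    void addVideo(String title, String description) {
        for (String word : (title + " " + description).trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // only happens when both texts are blank
            }
            WordItem item = words.computeIfAbsent(word, WordItem::new);
            item.count++;                // repeats within one video are counted
            item.videoTitles.add(title); // associate the video with the word
        }
    }

    /** Returns the entry for a word, or null if the word was never seen. */
    WordItem find(String word) {
        return words.get(word);
    }

    /** Returns the word with the highest count, or null if the index is empty. */
    WordItem mostUsed() {
        if (words.isEmpty()) {
            return null;
        }
        return Collections.max(words.values(), Comparator.comparingInt((WordItem w) -> w.count));
    }

    /** Returns all words ordered from most used to least used. */
    List<WordItem> sortedByCount() {
        List<WordItem> list = new ArrayList<>(words.values());
        list.sort(Comparator.comparingInt((WordItem w) -> w.count).reversed());
        return list;
    }
}
```

A HashMap gives the quick lookup by word, while the sorted list is built on demand because the map itself keeps no ordering by count.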
Testing your Application
It is expected that you will thoroughly test your application by developing code that demonstrates how the classes you have created work. The tests will exercise the classes you build to represent the videos and other items of information (e.g. words in the index, comparators, lists) and the functionality associated with them (sorting, finding the most used word, getting the title of a video).
You are provided with a set of JSON files as part of the skeleton code in the “data” directory.
Your project will be partially assessed on how well you demonstrate that it works.
You have been provided with a number of .json files as part of the skeleton code that contain different examples of the type of data returned from a YouTube Data API query. The table below describes each file and the number of videos contained within it.
File name                    | Description                                                      | No. Videos
youtubedata.json             | A single video object in a video query (mock data)               | 1
youtubedata_15_50.json       | 50 video objects from a query from category 15                   | 50
youtubedata_loremipsum.json  | 10 mock video objects with useful descriptions and titles        | 10
youtudedata_indextest.json   | A single video item for testing indexing                         | 1
youtubedata_singleitem.json  | A single video item not in a list                                | 1
youtubedata_malformed.json   | A single video object in a video query (mock data), but with an error | 1
The file “youtubedata_malformed.json” has one video item in a list but contains an error. This file is useful for testing the detection of errors in your parser. For example, the parser should throw an exception indicating a problem with the supplied .json file.
The file “youtubedata_singleitem.json” contains the data for a single video but not in a list. This is suitable for testing a class that represents video item data. The data is a JSONObject (enclosed between “{” and “}”).
The “youtudedata_indextest.json” file contains a fake title and description for testing your indexing. There should be 1 “ONE”, 2 “TWO”s, 3 “THREE”s, 4 “FOUR”s, and 5 “FIVE”s.
Along with the .json files are corresponding “_info.txt” files available on FLO. These files contain details about the videos contained in the .json file and can be used to provide information for your testing (e.g. the number of videos). That is, given the .json file you should expect your program to be able to produce the information contained in the “_info.txt” file.
The sorting and word counts in the “_info.txt” files closely follow the algorithm in the project specification. Your implementation must conform to what has been specified.
• The word count includes repeats of the word within the text of one message
    o e.g. text: “aaa aaa” = count for “aaa” is 2
I will check the code for correctness, but unit tests will give me more insight into how your code works and make it easier to mark.
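Demonstration code for the index counts could look something like the sketch below. It is only an illustration: the class and method names are hypothetical, and the input string is a stand-in for the text that would really be parsed from “youtudedata_indextest.json”. The expected values are the counts stated above.

```java
import java.util.HashMap;
import java.util.Map;

class IndexTestSketch {
    /** Counts raw whitespace-separated words, repeats included. */
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Throws if the actual count differs from the expected one, so failures cannot be missed. */
    static void expectCount(Map<String, Integer> counts, String word, int expected) {
        int actual = counts.getOrDefault(word, 0);
        if (actual != expected) {
            throw new AssertionError(word + ": expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        // Stand-in text (an assumption); the real input would come from the .json file.
        String text = "ONE TWO TWO THREE THREE THREE "
                + "FOUR FOUR FOUR FOUR FIVE FIVE FIVE FIVE FIVE";
        Map<String, Integer> counts = countWords(text);
        expectCount(counts, "ONE", 1);
        expectCount(counts, "TWO", 2);
        expectCount(counts, "THREE", 3);
        expectCount(counts, "FOUR", 4);
        expectCount(counts, "FIVE", 5);
        expectCount(countWords("aaa aaa"), "aaa", 2); // repeats in one text are counted
        System.out.println("All index checks passed");
    }
}
```

The same pattern extends to the other files: read the expected values from the matching “_info.txt” file and compare them with what your program produces.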
Application description document [up to 20%]
You should write a brief document to explain the features of your application. It is to be submitted via FLO as a PDF file. You can include text from this document; however, it is not a copy of your Javadoc.
This document can be thought of as an informal user guide and may include screenshots of the code and its output. You can describe the demonstration and test code.
If it is not submitted, then you will receive zero marks for this stage. It will be difficult to achieve an HD without this document.
Minimal Class Diagram Example
The UML diagrams in Figure 2 show a minimal set of classes that you could implement to complete the project. They are in no way prescriptive and are shown as a guide to what you will need to do to complete the project.
Figure 2. Minimal Class Diagram Example
How To: Get Started
A skeleton NetBeans project is available to get you started. You can download a zip archive from the General block on the FLO topic page. Once downloaded, unzip it, open the project in NetBeans, and look through the provided files. You are provided with the sample JSON files and the associated _info.txt files (in the data dir), the JSONP library (.jar file in the lib dir), and a single class called YouTubeTrender. This is your place to start and contains a sample test method you can use as a guide.
How To: Get YouTube Data
This is an FYI; you do NOT need to get your own YouTube data.
For this project, we will use the results of a YouTube Data API query over videos as our YouTube video data string. The YouTube Data API requires OAuth authentication for any queries over the videos and as such requires a developer to have a Google account and the necessary OAuth token.
For this project you do not need a Google account or to be able to query the YouTube servers to complete the project to a high level (e.g. get an HD). You are provided with sample query results.
The data provided to you comes from a query of the most popular videos from category 15, which is “Pets & Animals”, with 50 videos retrieved, for example:
Entering this query into a web browser will result in an error because it is not OAuth authenticated. However, you can use the YouTube Data API “try it” facility to try it out.
Correctly executed, the query returns data in a format suited to our application, which can easily read and parse the results of a search, and the data can be saved to a file. This format is called JSON. Details are given below on how to read and parse JSON formatted data.
To make development and testing easier, you have been provided with some .json files that contain the results of queries. This means you can test your application without a network connection or a Google account.
NOTE: you do not need to get data from the YouTube servers to complete the project or achieve an HD.
How To: Parse JSON data
To read or parse the data contained within the results of a YouTube Data query we need to be able to parse the JSON data object. JSON, or JavaScript Object Notation, is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. JSON is built on two structures:
• A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.
    o Denoted by “{” and “}”
• An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.
    o Denoted by “[” and “]”
We could try to parse the JSON format ourselves, but there exist quite a few APIs for doing this (there are 20 listed on the JSON website just for Java). You have been provided with the JSONP library file (javax.json-1.0.4.jar) as part of your skeleton code (in the lib directory) for parsing JSON data. We will be using the object model part of the library and documentation for JSONP can be found at:
Javadoc API documentation for this can be found at:
There are several interfaces and classes in the API but you will be mainly interested in
• JsonObject (extends java.util.Map)
    o string and value pairs contained between “{” and “}”
• JsonArray (extends java.util.List)
    o ordered list of values between “[” and “]”
as these are the main structures contained in the YouTube video listing.
The JSONP API has various methods for returning the correct type, e.g.
snippet.getString("title") returns the title field as a String, but if the value is missing or null this will cause an exception. Therefore, to use the API correctly you should generally check the type of the contents first, that is
JsonValue value = snippet.get("title");
if (value == null || value.getValueType() == JsonValue.ValueType.NULL) {
    // the title is missing or null, so handle that case before reading it as a String
}
So, what does a YouTube Data query result in JSON format look like? Figure 4 is an example that has been truncated to one search result for readability. Figure 3 lists a portion of basic Java code to extract the videos from the data. The fields of the video are listed here:
Figure 3. Basic code to read videos from a YouTube query
Figure 4. Structure of a YouTube Data JSON Video Search Query Result. “…” indicates truncated data