30. June 2019
100 Best – Hippie- and Protest Songs 2/4
Recap summer songs
It is Sunday again and my favorite radio station radioeins is doing another round of “The 100 best…” This time, it is the 100 best Hippie- and Protest songs.
I took a closer look at last week’s web-scraping process and found some peculiarities. The webpage doesn’t always appear to show all jury members: across different HTML reads I got either 109 or 110 distinct persons, so the results might not be consistent. I should inspect this behavior a bit more.
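One way to inspect this could be to keep the set of jury names from each read and check which names are missing where. The following is only a sketch: the function name `compare_reads` and the placeholder names are assumptions, and the scraping itself is left out.

```python
def compare_reads(reads: list[set[str]]) -> dict[str, list[int]]:
    """Map every name that is missing from at least one read
    to the indices of the reads that lack it."""
    everyone = set().union(*reads)  # all names seen in any read
    missing = {}
    for name in sorted(everyone):
        absent = [i for i, names in enumerate(reads) if name not in names]
        if absent:
            missing[name] = absent
    return missing


# Placeholder data, not real jury members.
first_read = {"Person A", "Person B", "Person C"}
second_read = {"Person A", "Person B"}
print(compare_reads([first_read, second_read]))
```

If the same name is missing every time, the page itself is probably the cause; if it varies between reads, the loading or parsing step is the more likely suspect.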
Naming Schemes
Another problem was different names for the same songs and artists, which means the final scores don’t add up. As a next step, I will point out some naming schemes that lead to false results.
1 – Different versions of the same song, e.g. live versions
-> Get rid of the live indication in the song name, so that all versions are counted as one song.
2 – “The”
Some jury members used an article for a band name, some didn’t. When the bands are sorted alphabetically for the final list, the article isn’t considered anyway, so I just exclude it from the data. Here is a nice example.
3 – “And” and “&”
Some artists have “And” or “&” in their name. I have to make this consistent.
See the Fresh Prince:
4 – Special Characters, line breaks
Extracting the information from HTML in some cases leads to data that still contains control characters, such as the line break \n.
5 – Spaces in general
As you can see in the following example, inconsistent spacing is a problem as well. In the example above, double spaces are followed by the line break, and sometimes stray spaces change the song name and hinder a good prediction of the final charts. As I don’t see a good use for spaces other than better readability, I just delete all of them.
Here is an overview of the changes I made to the script. This is just swiftly typed down; I will look for a better way to put this into a script and then upload it to my GitLab account.
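The five rules could be sketched roughly as follows in Python. This is not the original script: the function name `normalize`, the `is_artist` flag and the regular expressions are assumptions for illustration. In this sketch the article is removed only as a whole word, so that words like “They” stay intact, and “And” is unified to “&” only for artist names.

```python
import re


def normalize(name: str, is_artist: bool = False) -> str:
    """Collapse spelling variants of an artist or song name into one key."""
    # Rule 4: turn line breaks and tabs left over from the HTML into spaces.
    text = re.sub(r"[\r\n\t]+", " ", name)
    # Rule 1: drop a trailing live marker such as "(Live)" or " - Live at ...".
    text = re.sub(
        r"\s*(?:[\(\[]\s*live\b[^\)\]]*[\)\]]|\s-\s+live\b.*)$",
        "",
        text,
        flags=re.IGNORECASE,
    )
    # Rule 2: drop the article "The" (whole word only).
    text = re.sub(r"\bthe\b", "", text, flags=re.IGNORECASE)
    # Rule 3: write "And" as "&" in artist names.
    if is_artist:
        text = re.sub(r"\band\b", "&", text, flags=re.IGNORECASE)
    # Rule 5: delete all remaining whitespace.
    return re.sub(r"\s+", "", text)


# Both spellings end up as the same key.
print(normalize("DJ Jazzy Jeff And The Fresh Prince", is_artist=True))
print(normalize("DJ Jazzy Jeff & The Fresh Prince", is_artist=True))
print(normalize("The  Beatles\n", is_artist=True))  # Beatles
print(normalize("Imagine (Live)"))                  # Imagine
```

The order matters a little: the line breaks are turned into spaces first so that the later patterns only have to deal with ordinary spaces, and the spaces are removed last because the word-boundary checks rely on them.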
Hippie Songs
Following the new insights from recapping last week, I could simply apply my script to this week’s top-100 hippie song charts. Here is my prediction for the top 20 songs; you can listen to the show at radioeins.de.
place | artist | song | score | mentioned | average place |
1 | JohnLennon | Imagine | 245 | 34 | 4.500000 |
2 | JeffersonAirplane | WhiteRabbit | 141 | 21 | 4.761905 |
3 | BuffaloSpringfield | ForWhatIt'sWorth | 93 | 14 | 5.000000 |
4 | Mamas&Papas | CaliforniaDreamin' | 92 | 15 | 5.066667 |
5 | JoeCocker | WithALittleHelpFromMyFriends | 82 | 11 | 3.909091 |
6 | BobDylan | TimesyAreA-Changin' | 72 | 12 | 5.333333 |
7 | JimiHendrixExperience | PurpleHaze | 67 | 10 | 4.800000 |
8 | BobDylan | Blowin'InWind | 63 | 9 | 4.444444 |
9 | JoniMitchell | Woodstock | 60 | 10 | 5.400000 |
10 | BarryMcGuire | EveOfDestruction | 60 | 10 | 5.300000 |
11 | JanisJoplin | MercedesBenz | 59 | 8 | 4.250000 |
12 | Beatles | AllYouNeedIsLove | 59 | 6 | 2.500000 |
13 | JanisJoplin | MeAndBobbyMcGee | 53 | 11 | 6.272727 |
14 | JeffersonAirplane | SomebodyToLove | 53 | 9 | 5.333333 |
15 | Byrds | Turn!Turn!Turn!(ToEverythingreIsASeason) | 53 | 7 | 4.142857 |
16 | GilScott-Heron | RevolutionWillNotBeTelevised | 52 | 8 | 5.125000 |
17 | ScottMcKenzie | SanFrancisco(BeSureToWearFlowersInYourHair) | 49 | 11 | 6.727273 |
18 | PlasticOnoBand | GivePeaceAChance | 49 | 9 | 5.888889 |
19 | RichieHavens | Freedom | 48 | 8 | 5.375000 |
20 | JimiHendrix | Star-SpangledBanner | 45 | 7 | 4.857143 |
21 | EdwinStarr | War | 44 | 8 | 5.875000 |
22 | CreedenceClearwaterRevival | FortunateSon | 44 | 7 | 5.142857 |
23 | JimiHendrixExperience | AllAlongWatchtower | 43 | 8 | 5.750000 |
24 | CannedHeat | GoingUpCountry | 38 | 8 | 6.375000 |
25 | NeilYoung | HeartOfGold | 38 | 7 | 5.714286 |