Genius: handle lyrics divs with no text #5585
Open
Description
Fixes #5583.
It's possible for a Genius page to have a lyrics div for which BeautifulSoup's `get_text` method returns no text, which causes lyric fetching to crash. Based on the example in #5583, I believe the problem URL was this one: https://genius.com/Ludwig-van-beethoven-symphony-no-5-in-c-minor-op-67-1st-movement-allegro-con-brio-annotated

Note that the "lyrics" are a series of pictures of the pages of the score, so the text retrieved is an empty string. Prior to this fix, when this happened, the existing loop would remove the last character of the string until it was empty, then attempt to access the last character again and crash.
The intention of that loop seems to be to do what `lyrics.rstrip('\n')` would do. I replaced it with `lyrics.rstrip()` to ensure that if we have lyrics consisting only of any whitespace, they'll be turned into an empty string (and this will become `None` as in other cases where there are no lyrics to be found).
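
For reviewers, here is a minimal sketch of the failure mode and the replacement. It is an illustration under assumptions, not the actual plugin code: `strip_trailing_newlines_unsafe` is a hypothetical reconstruction of the loop pattern described above, and `clean_lyrics` is a hypothetical helper that folds the `rstrip()` call and the empty-to-`None` conversion into one place.

```python
from typing import Optional


def strip_trailing_newlines_unsafe(lyrics: str) -> str:
    # Hypothetical reconstruction of the old pattern (an assumption, not
    # the real code): indexing lyrics[-1] raises IndexError as soon as
    # the string is empty.
    while lyrics[-1] == "\n":
        lyrics = lyrics[:-1]
    return lyrics


def clean_lyrics(lyrics: str) -> Optional[str]:
    # str.rstrip() with no argument removes all trailing whitespace and
    # is safe on an empty string.
    stripped = lyrics.rstrip()
    # An empty result is treated as "no lyrics found".
    return stripped or None


if __name__ == "__main__":
    for sample in ["", "\n\n", "la la la\n\n"]:
        print(repr(sample), "->", repr(clean_lyrics(sample)))
```

Note that `rstrip()` only touches trailing whitespace, so a string made up entirely of whitespace collapses to `''`, while leading whitespace in real lyrics is left alone.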