Issue #2
- Deleted Account
- And then the structure of hand-formatted text changes and your heuristic fails:
https://instantview.telegram.org/contest/vesma.today/template25/?url=http%3A%2F%2Fvesma.today%2Farticle%2Fpost%2F247-mnenie-o-formalnykh-vyborakh-s-ocherednoy-zamanukhoy-
Or it grabs the author name together with stray leftover text:
https://instantview.telegram.org/contest/vesma.today/template25/?url=http%3A%2F%2Fvesma.today%2Farticle%2Fpost%2F267-istoriya-ne-terpit-suety-magadanskomu-teatru-ispolnilos-80-let-
Of course, now that I have found these failing cases for you, you can fix them too, basically by special-casing them. But only until the next time. Extracting values from hand-formatted text on a website that is going to be updated is just not practical.
- Declined by admin
- Not safely/reliably identifiable. It's fine to leave them like this.
- Type of issue
- IV page is missing essential content
- Reported
- Feb 14, 2019
Trust me, it's possible to identify the author on pages like this. Check my template.