Selenium script to delete comments and submissions on voat

1    12 Feb 2019 22:42 by u/aGameCalledCountries

On Mac: pip3 install iso8601, pip3 install selenium, and brew cask install chromedriver (newer Homebrew versions spell this brew install --cask chromedriver).

This script should work on Windows and Linux too, but I'm not tech support.

This is a repost of this script, because I ran the script after posting it the first time.

#!/usr/bin/env python3
"""
Selenium script that deletes your old comments and submissions on voat
"""
import time
from datetime import datetime, timezone, timedelta
# run `pip3 install iso8601` if you get an error about missing this library
import iso8601
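# run `pip3 install selenium` if you get an error about missing this library
# (the find_element_by_* helpers used below were removed in Selenium 4.3,
# so this script needs an older release, e.g. `pip3 install "selenium<4.3"`)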
from selenium import webdriver
# Add your login credentials for voat here
USERNAME = ""
PASS = ""
# List of subs you do not want your content deleted from
# e.g. EXCLUDED_SUBS = ["TheBible", "HardPreaching"]
EXCLUDED_SUBS = []
# Length of time to keep your posts.  When this script is run,
# comments and submissions newer than the entered duration will
# not be deleted. Available options are arguments to timedelta
# object.
# See here: https://docs.python.org/3/library/datetime.html#datetime.timedelta
TIME_DELTA = {"days": 10}
def login(d):
    body = d.find_element_by_tag_name("body")
    body.click()
    c1 = d.find_element_by_id("container")
    c = c1.find_element_by_id("loginForm")
    for l in c.find_elements_by_tag_name("label"):
        try:
            l.click()
        except Exception:
            # ignore labels that cannot be clicked
            pass
    c.find_element_by_id("UserName").send_keys(USERNAME)
    c.find_element_by_id("Password").send_keys(PASS)
    for btn in c.find_elements_by_tag_name("input"):
        if btn.get_attribute("value") == "Log in":
            btn.click()
            break
    time.sleep(1)
def delete_submissions(d):
    # use the driver passed in, not the module-level global
    cw = d.current_window_handle
    while True:
        c = d.find_element_by_id("container")
        try:
            for sub in c.find_elements_by_class_name("submission"):
                try:
                    comments = sub.find_element_by_class_name("comments")
                    href = comments.get_attribute("href")
                    print(href)
                    # `continue` must act on the submission loop, so test
                    # all excluded subs in one expression
                    if any(s in href for s in EXCLUDED_SUBS):
                        print("Skipping submission %s" % href)
                        continue
                    t = sub.find_element_by_tag_name("time").get_attribute("datetime")
                    if datetime.now(timezone.utc) - iso8601.parse_date(t) < timedelta(
                        **TIME_DELTA
                    ):
                        print("Skipping newer submission %s" % href)
                        continue
                    d.execute_script("window.open('about:blank', 'link');")
                    d.switch_to.window("link")
                    d.get(href)
                    time.sleep(1)
                    ent = d.find_element_by_class_name("entry")
                    ent.find_element_by_link_text("delete").click()
                    ent.find_element_by_link_text("yes").click()
                    # time.sleep(1)
                    d.close()
                    d.switch_to.window(cw)
                except Exception as ex:
                    print(ex)
                    # return to the listing window if the failure happened in the "link" tab
                    d.switch_to.window(cw)
        except Exception as ex:
            print(ex)
        try:
            c = d.find_element_by_id("container")
            nxt = c.find_element_by_link_text("next ›")
            nxt.click()
        except Exception:
            # no "next" link means this was the last page
            break
def delete_comments(d):
    # use the driver passed in, not the module-level global
    cw = d.current_window_handle
    while True:
        c = d.find_element_by_id("container")
        try:
            for comment in c.find_elements_by_class_name("comment"):
                try:
                    l = comment.find_element_by_link_text("permalink")
                    href = l.get_attribute("href")
                    print(href)
                    # `continue` must act on the comment loop, so test
                    # all excluded subs in one expression
                    if any(s in href for s in EXCLUDED_SUBS):
                        print("Skipping comment %s" % href)
                        continue
                    t = comment.find_element_by_tag_name("time").get_attribute(
                        "datetime"
                    )
                    if datetime.now(timezone.utc) - iso8601.parse_date(t) < timedelta(
                        **TIME_DELTA
                    ):
                        print("Skipping newer comment %s" % href)
                        continue
                    d.execute_script("window.open('about:blank', 'link');")
                    d.switch_to.window("link")
                    d.get(href)
                    time.sleep(1)
                    cmnt_area = None
                    try:
                        cmnt_area = d.find_element_by_class_name("commentarea")
                    except Exception:
                        d.close()
                        d.switch_to.window(cw)
                        continue
                    ent = cmnt_area.find_element_by_class_name("entry")
                    ent.find_element_by_link_text("...").click()
                    ent.find_element_by_link_text("delete").click()
                    time.sleep(1)
                    ent.find_element_by_link_text("yes").click()
                    time.sleep(1)
                    d.close()
                    d.switch_to.window(cw)
                except Exception as ex:
                    print(ex)
                    # return to the listing window if the failure happened in the "link" tab
                    d.switch_to.window(cw)
                    continue
        except Exception as ex:
            print(ex)
        try:
            c = d.find_element_by_id("container")
            nxt = c.find_element_by_link_text("next ›")
            nxt.click()
        except Exception:
            # no "next" link means this was the last page
            break
if __name__ == "__main__":
    page = "https://voat.co/account/login"
    driver_name = "Chrome"
    driver = getattr(webdriver, driver_name)()
    try:
        driver.set_page_load_timeout(10)
        driver.get(page)
        login(driver)
        page = "https://voat.co/u/%s/submissions" % USERNAME
        driver.get(page)
        delete_submissions(driver)
        page = "https://voat.co/u/%s/comments" % USERNAME
        driver.get(page)
        delete_comments(driver)
        time.sleep(3)
    finally:
        # quit() closes every window and stops the chromedriver process
        driver.quit()
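
The skip rules in the script (excluded subs, plus the TIME_DELTA age cutoff) can be checked without a browser. Below is a minimal sketch of that decision as one function. The name should_delete, its parameters, and the example links are made up for illustration and are not part of the script above; it also uses the standard library's datetime.fromisoformat instead of iso8601.parse_date, which assumes the timestamp carries an explicit offset such as +00:00.

```python
from datetime import datetime, timedelta, timezone


def should_delete(href, posted_at, excluded_subs, keep, now=None):
    """Return True when an item is not in an excluded sub and is at least `keep` old.

    `keep` is a dict of timedelta arguments, like TIME_DELTA in the script.
    """
    # Same substring test the script applies to the permalink.
    if any(sub in href for sub in excluded_subs):
        return False
    if now is None:
        now = datetime.now(timezone.utc)
    # Assumption: posted_at is ISO 8601 with an explicit offset, e.g. "+00:00".
    age = now - datetime.fromisoformat(posted_at)
    return age >= timedelta(**keep)


if __name__ == "__main__":
    # Hypothetical links and timestamps, only to exercise the function.
    fixed_now = datetime(2019, 2, 12, tzinfo=timezone.utc)
    keep = {"days": 10}
    excluded = ["TheBible"]
    print(should_delete("https://voat.co/v/example/1", "2019-01-01T00:00:00+00:00", excluded, keep, fixed_now))
    print(should_delete("https://voat.co/v/TheBible/2", "2019-01-01T00:00:00+00:00", excluded, keep, fixed_now))
    print(should_delete("https://voat.co/v/example/3", "2019-02-10T00:00:00+00:00", excluded, keep, fixed_now))
```

Passing `now` explicitly keeps the check repeatable; leaving it out falls back to the current UTC time, as the script does.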

13 comments

0

I'm not sure but it might be possible to see deleted comments / posts. It might be better to edit them.

0

it might be possible to see deleted comments / posts

No it's not.

0

Now why the hell would you do all that when you can do a javascript setInterval, jquery select and confirm, with a pause long enough not to fire off the activity filter?

Literally just a few lines of code.

0

Feel free to improve it, but you have to open a new window to delete the submission/comment, fyi.

0

https://api.jquery.com/jquery.get/

Just get into a $ object that doesn't load in the dom, .find("elementID") and manipulate as needed.

Same idea as saying

var myListItem = $("<li/>")

and now you have a list item element that you can do all kinds of things to.

You're still only looking at... less than 30, maybe 40 lines of code. A lot less if you're clever. And firefox dev console has a javascript scratchpad with autocomplete. Don't even need to fire up an IDE.

0

Really I think your approach is crap, and question whether it will actually work. I can run this script on a cron job every day and not have to think about your copy/paste nonsense.

0

He's right though, objectively your solution is worse. Using a bulky webdriver to automate clicks and navigation is crap compared to using raw data and requests.

Login > Scrape history > Get the IDs for the posts you want to delete > Send all the POST requests to delete the comments

Much faster and uses virtually no resources compared to chromedriver. I used to use selenium when I was lazy, but ended up figuring out that doing everything with raw requests was faster and better overall.

0

Voat uses some anti-fraud stuff and the JavaScript on the page needs to run for the form to be submitted. Your/his approach won’t work. You can’t “send all the post requests” because you don’t have the generated tokens just by scraping the html. That is the reason for using selenium, because PuttItOut only gave api access to about 2 accounts.

0

anti-fraud stuff

VoatRequestVerificationToken is just a hidden field, a very old practice. Just make a request and parse it.

You can’t “send all the post requests” because you don’t have the generated tokens just by scraping the html. That is the reason for using selenium, because PuttItOut only gave api access to about 2 accounts.

Handle everything the exact same way your browser does when you're logged in as a user. You don't need an API or a slow-ass webdriver to hold your hand.
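
A minimal sketch of the "hidden field, just parse it" idea, in Python with only the standard library. The sample markup is made up: only the field name VoatRequestVerificationToken comes from this thread, and the form action, token value, and overall page layout are assumptions. It shows only the parsing step; whether a token read this way is enough for Voat to accept the request is exactly what is disputed above.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name -> value for every <input type="hidden"> in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        # Attribute values can be None when written without "=", so guard them.
        is_hidden = (attr.get("type") or "").lower() == "hidden"
        if is_hidden and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in an HTML string."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


# Hypothetical markup, for illustration only.
SAMPLE = """
<form action="/delete" method="post">
  <input type="hidden" name="VoatRequestVerificationToken" value="abc123" />
  <input type="submit" value="yes" />
</form>
"""

if __name__ == "__main__":
    print(hidden_fields(SAMPLE))
```

The returned dict would then be sent back with the form data in the follow-up POST, the same way a browser submits the form.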

0

But then you have to have support for python and the selenium browser driver, keep the driver version in lock-step with your browser version, and deal with a few other bits of bullshit that your average programmer isn’t going to want to fuck with unless their dev environment is already set up for that.

Yeah, I know about selenium since I write our company site tests with it. I know it’s okay but far more complicated than it needs to be.

While we’re on the topic, why not just write a cypress.io script? Write your tests and scripts in JavaScript and even run them in BrowserStack.

Point of the matter is that you have an irrational love of your script. It’s mediocre at best, too much of a fuckin pain in the ass realistically. A script like I’m talking about though can be made into a bookmarklet. Want a cron job? Shortcut that bitch and run it. Don’t want a browser window to have to fuck with? Run it in a headless browser. Point is that selenium isn’t the most effective or efficient way to do it.

0

I think a good script would delete any comment or post of yours that you saved, after two days or so. That way you can pick exactly what you want to delete.

0

Would be good to have a script that deletes posts / comments of yours that you saved, after two days. That way you can set exactly which posts / comments to delete.

0

When you delete your voat account it gives you the option to nuke it.