Long-time coder here with a STEM degree, but not a CS degree. I’ve mostly used programming as a tool to help me with my job rather than as my job, so I’m conscious that I have some gaps in my programming skillset.

To close a couple of those gaps, I’m trying to become competent at GitHub and take my Python skills to the next level.

If you kind people could provide suggestions on improvements I could make to this repo and the code in it, I’d be ever so grateful. :)

It’s a bot to run Lemmy posts and comments through the pretrained Detoxify transformer model and to report toxic comments for the Mods to action.

  • UlrikHDA
    1 year ago

    Obviously this is opinionated and I won’t pretend it’s the only correct way, but a few things stood out to me:

    • Inconsistent use of type hinting. You type hint the “elem” arg for process_content and nothing else. Personally I use type hints religiously, but at the very least I would type hint every arg. The type may be obvious to you now, but it may not be in 6 months, or to others who want to contribute.

    • While on the topic of type hints, you use “#” comments to describe the purpose of each function, but you really should use docstrings instead. Editors that support Python will then show each function’s description from its docstring, without you having to jump to the declaration to read it. It’s particularly useful when you have multiple modules. In some IDEs, like PyCharm, the same format works on variables too.

    • You should wrap the infinite loop at the bottom in if __name__ == '__main__': so that, if you later want to reuse the class/module and import it into another file, the import doesn’t start the loop and lock up.
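
    A minimal sketch of the three points above taken together. Everything here is made up for illustration: the Comment class, is_toxic, the 0.8 threshold and this process_content signature are assumptions, not the repo’s actual code.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    """Hypothetical stand-in for a Lemmy comment (not the bot's real data model)."""

    comment_id: int
    text: str


def is_toxic(score: float, threshold: float = 0.8) -> bool:
    """Return True when a toxicity score meets or exceeds the threshold.

    Args:
        score: Toxicity probability between 0 and 1.
        threshold: Cut-off for reporting (0.8 is an assumed value).
    """
    return score >= threshold


def process_content(elem: Comment, score: float) -> str:
    """Decide what to do with one comment and describe the outcome."""
    action = "report" if is_toxic(score) else "ignore"
    return f"{action} comment {elem.comment_id}"


def main() -> None:
    """Entry point; the bot's polling loop would live here."""
    print(process_content(Comment(comment_id=1, text="hello"), score=0.05))


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    main()
```

    With the guard in place, another file can import process_content without kicking off the loop, and an editor will show the docstring and argument types on hover.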

    And the most opinionated point of them all:

    • I would recommend running a linter like pylint to warn about potential code smells. E.g. you’re redefining the Python built-in “id”, no exception types are specified in your try blocks, there are too many branches and statements in process_content() (which would probably benefit from being split into smaller functions), some lines are twice as long as the recommended length, the import order is wrong, etc… (these are purely pylint feedback)
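
    As a rough illustration of the first two warnings (pylint calls them redefined-builtin and bare-except), here is a sketch that avoids both. The function name and the JSON payload shape are assumptions for the example, not something taken from the repo.

```python
import json
from typing import Optional


def extract_comment_id(payload: str) -> Optional[int]:
    """Return the comment ID from a JSON payload, or None if it is unusable."""
    try:
        data = json.loads(payload)
        # Named comment_id rather than id, so the built-in id() is not shadowed.
        comment_id = int(data["id"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        # Only the expected failures are caught; anything else still surfaces.
        return None
    return comment_id
```

    Listing the exception types means a typo or an unexpected bug inside the try block raises loudly instead of being silently swallowed.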

    I assume the setup is much the same with GitHub’s CI, but with GitLab you can automate pylint to check the code with this:

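      ```yaml
      # The job name "pylint" is arbitrary; *folder* is a placeholder for your code directory.
      pylint:
        image: python:3.10
        script:
          - pip install pylint
          - pylint *folder*
      ```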