How to Skin A Penguin

Recently on my GitHub account I changed one of my roles from Qualified Cat Herder to Qualified Cat Skinner, because I've mastered herding them all into a corner and now it's time to take it to the next level and get a little more medieval on their ass. But then it hit me: a lot of the time these cats are really penguins that are just dressed like cats. Doh!

As you can see ^^ I have a wide range of roles that run the gamut from cleaning up shit to putting in long hours of bending over forward banging my head against the wall trying to figure out WTF is wrong with my life. And then it dawned on me: I have to skin the damn penguins so my little daemon buddy Beastie can do his job. Then I can get back to scuba diving in Mexico knowing that he's got the power to serve, so I don't have to worry about uptime.

I'm kinda a weird one in that I'd rather put my applications in FreeBSD jails than in Linux VMs or in Docker. A lot of that is habit, plus it works, and a jail basically is a container. I use tools like CBSD, shell scripts, awk, (now some Rust), pipes and these sorta things ">>" to orchestrate applications for repeatable deployments on FreeBSD. Maybe I'm killing myself for no reason and I should just adopt Docker and Kubernetes. But in reality simpler is oftentimes easier, with fewer parts, bells, whistles and other doohickeys that, it turns out, add new dimensions to worry about for security and stability. I think FreeBSD is the top operating system. He totally bangs Tux on the reg if given the chance.

Yes, I guess cloud native containers are great if you can afford to run a badass Kubernetes cluster with dedicated bare metal database server(s) over at Amazon. But I don't have that kinda scratch, so I do it in a rack in South City Computer's basement on cheap used servers with 32 cores and like 128GB of RAM, Hillary Clinton style. There's just something tangible for me about having that much bare metal horsepower to play with for my business operations. I don't think I'm that special of a use case, though. Most businesses access their applications from the office.

There's really no good reason to put a file server, groupware, team, blah blah blah on the cloud when they could get more bang on a faster network in house. Sure, push the backups to a cheap S3 storage provider. Do some routing for the people working from home. With traffic shaping and faster and faster network speeds, putting workplace applications in a data center on someone else's over-provisioned servers ("The Cloud") is arguably unnecessary for most small businesses.

And so I like doing that with FreeBSD because once it's set up it just goes like a bunny. I'm not saying the Penguin sucks. Linux actually is really super awesome too. But if I can put the application on FreeBSD, jail it, use native ZFS to push snapshots to a local backup box and then send incrementals to offsite storage, "The Cloud" (i.e., in my case an encrypted drive attached to a Raspberry Pi in my mom's closet down in Florida), then why give Amazon the money?! And if you're worried about your electric bill, concerned about greenhouse gases, and want to cut your risk of being hacked by A-holes overseas who are wide awake when you're asleep by more than half, shut the servers down with a crontab at night when nobody is in the office, restaurant, or in my case computer repair shop using them. Wake on LAN... easy cheesy and done.
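For the Wake on LAN half of that, here is a minimal Python sketch of what the wake-up call looks like. It is not what I actually run, just an illustration: the function names, the example MAC address and the choice of UDP port 9 are all assumptions made up for this sketch. The only fixed part is the magic packet layout itself, which is 6 bytes of 0xFF followed by the target's MAC address repeated 16 times.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the Wake-on-LAN magic packet for a MAC like '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Magic packet layout: 6 bytes of 0xFF, then the MAC repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (port 9 is an assumed, common choice)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))
```

Something like `send_magic_packet("00:11:22:33:44:55")` (a made-up MAC) would be run in the morning from any box on the same LAN that stays on overnight. The sleeping server's network card and BIOS need Wake on LAN enabled for it to do anything.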

Originally my intent for this article was not to spew out my obstinate opinion about why I think paying for bandwidth twice is stupid. But I hope it gave you a quick overview of my motivation for skinning Penguins. Some application developers (a lot, actually) don't share this opinion and develop their applications using Linuxisms and Bashisms with seemingly zero F's given for guys like me who don't want it in Docker, who want to deliver it to the customer on the fastest network possible for the best price possible, so the customer can take the money they're not burning in "The Cloud" and pay me instead to make cool shit for them that runs pretty much forever unattended. Again, the ultimate goal is scuba diving in Mexico, let's not forget.

So enter Papermerge, a new document management system that I want to adopt. It's written in Python and uses a PostgreSQL database backend and Tesseract for OCR, with full text search, etc. My goal is to get rid of all this paper, mail, receipts, invoices, etc. and easily/automagically be able to import, catalogue, organize, find, share and secure the documents that I use for my business, in the hope that it improves our ability to help ourselves and then eventually, I think, help my customers with the pile of disorganized crap on their desks.

The developer documentation makes zero mention of FreeBSD. There is no FreeBSD port. But damn this app is slick. From what I can see in the system requirements, all the components it needs are also available for FreeBSD. Nothing is really so special about it. It's a Django (Python) application that uses Tesseract for OCR and a PostgreSQL database (or SQLite for small/non-production setups). There is no reason why this won't work on FreeBSD from what I can tell.

Except for having to translate Penguin into Beastie: the subtle differences in things as basic as what the Python libraries are called, how I need to alias pip3 to pip, how I need to troubleshoot all the errors about missing this and that, install gcc, and basically give my FreeBSD box all the things that are kinda glossed over in the Linux-specific installation instructions in the documentation. Maybe I'm a trailblazer. Maybe this means I get to be the port maintainer. Or maybe I'm just stupid and I should say fuck it and just use Ubuntu and their pre-built Docker image. I might still do that if the Penguin won't let me skin it.

Where I'm at right now is below. But after a night's sleep I think these last penguins I gotta skin might actually be the last ones before I get Papermerge working in a FreeBSD jail.


[root@doc papermergeDMS]# python3 ./manage.py runserver
Watching for file changes with StatReloader
Performing system checks...

System check identified some issues:

WARNINGS:
?: Papermerge can't find convert. Without it, image resizing is not possible.
        HINT: Either it's not in your PATH or it's not installed.
?: Papermerge can't find identify. Without it, it is not possible to count pages in TIFF.
        HINT: Either it's not in your PATH or it's not installed.
?: Papermerge can't find pdfinfo. Without it, Papermerge won't function properly. It won't be able to find out PDF files page count.
        HINT: Either it's not in your PATH or it's not installed.
?: Papermerge can't find pdftk. Without it, Papermerge won't be able to cut/paste PDF pages.
        HINT: Either it's not in your PATH or it's not installed.
?: Papermerge can't find pdftoppm. Without it, it not possible to extract images from PDF.
        HINT: Either it's not in your PATH or it's not installed.
?: Papermerge can't find tesseract. Without it, OCR of the documents is impossible.
        HINT: Either it's not in your PATH or it's not installed.
?: papermerge.conf.py file was found. Following locations attempted /etc/papermerge.conf.py, papermerge.conf.py
        HINT: Create one of those files or point PAPERMERGE_CONFIG environment name to it.

System check identified 7 issues (0 silenced).
October 28, 2021 - 22:12:28
Django version 3.0.8, using settings 'config.settings.dev'
Starting development server at https://127.0.0.1:8000/
Quit the server with CONTROL-C.
upload for f=2019_TaxReturn.pdf user=nwheelo
Internal Server Error: /upload/
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/usr/local/lib/python3.8/site-packages/django/core/handlers/base.py", line 115, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "/usr/local/lib/python3.8/site-packages/django/core/handlers/base.py", line 113, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/usr/local/lib/python3.8/site-packages/django/contrib/auth/decorators.py", line 21, in _wrapped_view
    return view_func(request, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/django/views/generic/base.py", line 71, in view
    return self.dispatch(request, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/django/views/generic/base.py", line 97, in dispatch
    return handler(request, *args, **kwargs)
  File "/usr/local/papermergeDMS/papermerge/core/views/documents.py", line 326, in post
    page_count = get_pagecount(f.temporary_file_path())
  File "/usr/local/lib/python3.8/site-packages/mglib/pdfinfo.py", line 86, in get_pagecount
    compl = subprocess.run(
  File "/usr/local/lib/python3.8/subprocess.py", line 493, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/local/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/lib/python3.8/subprocess.py", line 1704, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '/usr/bin/pdfinfo'
[28/Oct/2021 22:12:57] "POST /upload/ HTTP/1.1" 500 112606

I think the problem now is just that Linux craps up systemspace with userspace and puts everything in /usr/bin, whereas FreeBSD's file system hierarchy is very nicely and predictably organized between the OS's stuff and all the other stuff that gets installed. FreeBSD, if you don't know, has a separate place for your programs, configurations and everything else an application needs: /usr/local/*. So when the code goes looking for /usr/bin/pdfinfo, it isn't there. So the problem here is how do I skin the penguin?

  1. The easy way: symlink to the missing things the application thinks lives in /usr/bin, and probably /etc
  2. Port the application so it uses the correct file system paths for FreeBSD.
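To give a feel for what option 2 might involve, here is a rough Python sketch of looking a tool up by name instead of trusting a hardcoded Linux path. This is not Papermerge's or mglib's actual code. `resolve_tool` and its `search_path` parameter are hypothetical names invented for this example; the only real piece is the standard library's `shutil.which`.

```python
import os
import shutil
from typing import Optional


def resolve_tool(hardcoded_path: str, search_path: Optional[str] = None) -> Optional[str]:
    """Return a usable path for a tool the code expects at `hardcoded_path`.

    Hypothetical helper: if the hardcoded location does not exist, fall back
    to looking the tool up by name, the way a shell would.
    """
    if os.path.isfile(hardcoded_path) and os.access(hardcoded_path, os.X_OK):
        return hardcoded_path
    # shutil.which searches `search_path`, or $PATH when it is None.
    return shutil.which(os.path.basename(hardcoded_path), path=search_path)
```

With something like this, a call such as `resolve_tool("/usr/bin/pdfinfo")` would return wherever pdfinfo really lives on the box, or `None` if it is not installed at all, which is a much friendlier failure than a traceback in the middle of an upload.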

I'll probably end up just writing my own installer script that takes into account all the Linuxisms and translates them to Beastieisms, creates the missing pieces using symlinks and does some other "magic" to get it working before I volunteer to maintain a FreeBSD port. When I'm done, and if it's working, maybe I'll share it if anyone is actually interested.
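The symlink part of such an installer could be sketched like this in Python. Treat it as an assumption-laden illustration, not the finished script: `link_tools` and its parameters are made up for this example, the tool names are simply the ones from the warnings in the runserver output, and the sketch assumes the tools have already been installed somewhere under /usr/local/bin. It defaults to a dry run so nothing gets written into /usr/bin by accident.

```python
import os
import shutil
from typing import List, Optional

# Tool names taken from the warnings in the runserver output.
TOOLS = ["convert", "identify", "pdfinfo", "pdftk", "pdftoppm", "tesseract"]


def link_tools(tools: List[str], link_dir: str = "/usr/bin",
               search_path: Optional[str] = "/usr/local/bin",
               dry_run: bool = True) -> List[str]:
    """Symlink each tool found on `search_path` into `link_dir`.

    Returns the names that could not be found (those still need installing).
    Anything that already exists in `link_dir` is never overwritten.
    """
    missing = []
    for name in tools:
        real = shutil.which(name, path=search_path)
        if real is None:
            missing.append(name)
            continue
        link = os.path.join(link_dir, name)
        if os.path.lexists(link):
            continue  # leave whatever is already there alone
        if dry_run:
            print(f"would link {link} -> {real}")
        else:
            os.symlink(real, link)
    return missing
```

Running `link_tools(TOOLS)` first shows what would be linked and returns whatever is still missing. Passing `dry_run=False` as root would actually create the links.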

Here is someone smarter than me who explains why FreeBSD rocks. It's also where I stole the awesome pic.

