Installing paperless-ngx (RPi 4)

 

Background

- I am trying this out to cut down on printing.

- I am installing it on a Raspberry Pi 4.

 

 

Attempts and errors

- Installing it as a stack in Portainer failed every time.

- After some searching, I found the advice that low-spec devices should use SQLite as the database.

 

 

 

 

Success

- Create an installation folder and set its permissions to 777.

- Run the command below.

bash -c "$(curl --location --silent --show-error https://raw.githubusercontent.com/paperless-ngx/paperless-ngx/main/install-paperless-ngx.sh)"
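
The two steps above could be sketched roughly as follows. This is only a minimal sketch: `prepare_install_dir` is a hypothetical helper, and the default folder (`./paperless` under the current directory) is an assumption, while the transcript below uses /vo2/docker/paperless. The installer call is left as a comment because it is interactive.

```shell
# Hypothetical helper: create the installation folder and open its
# permissions to 777, as described in the post.
prepare_install_dir() {
  dir="$1"
  mkdir -p "$dir"
  chmod 777 "$dir"
}

# Assumed default location; override by setting INSTALL_DIR beforehand.
INSTALL_DIR="${INSTALL_DIR:-$PWD/paperless}"
prepare_install_dir "$INSTALL_DIR"

# The installer proposes a target folder relative to the current directory,
# so change into the installation folder before running it.
cd "$INSTALL_DIR"

# Then run the installer from the post (interactive, so commented out here):
# bash -c "$(curl --location --silent --show-error https://raw.githubusercontent.com/paperless-ngx/paperless-ngx/main/install-paperless-ngx.sh)"
```

Mode 777 is simply what worked in this post. Ownership matching the user and group IDs entered in the installer may be enough, but that was not tested here.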

 

Output of the run

#############################################
###   paperless-ngx docker installation   ###
#############################################

This script will download, configure and start paperless-ngx.

1. Application configuration
============================

The URL paperless will be available at. This is required if the
installation will be accessible via the web, otherwise can be left blank.
Example: https://paperless.example.com

 

URL []:      # can be left blank

 

The port on which the paperless webserver will listen for incoming
connections.

 

Port [8000]: 8779       # set the port

 

Paperless requires you to configure the current time zone correctly.
Otherwise, the dates of your documents may appear off by one day,
depending on where you are on earth.
Example: Europe/Berlin
See here for a list: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones

 

Current time zone [Asia/Seoul]:

 

Database backend: PostgreSQL, MariaDB, and SQLite are available. Use PostgreSQL
if unsure. If you're running on a low-power device such as Raspberry
Pi, use SQLite to save resources.

 

Database backend (postgres sqlite mariadb) [postgres]: sqlite                   # choose sqlite

 

Paperless is able to use Apache Tika to support Office documents such as
Word, Excel, Powerpoint, and Libreoffice equivalents. This feature
requires more resources due to the required services.

 

Enable Apache Tika? (yes no) [no]:

 

Specify the default language that most of your documents are written in.
Use ISO 639-2, (T) variant language codes:
https://www.loc.gov/standards/iso639-2/php/code_list.php
Common values: eng (English) deu (German) nld (Dutch) fra (French)
This can be a combination of multiple languages such as deu+eng

 

OCR language [eng]:

 

Specify the user id and group id you wish to run paperless as.
Paperless will also change ownership on the data, media and consume
folder to the specified values, so it's a good idea to supply the user id
and group id of your unix user account.
If unsure, leave default.

 

User ID [1000]:
Group ID [1003]:

 

2. Folder configuration
=======================

The target folder is used to store the configuration files of
paperless. You can move this folder around after installing paperless.
You will need this folder whenever you want to start, stop, update or
maintain your paperless instance.

 

Target folder [/vo2/docker/paperless/paperless-ngx]:

 

The consume folder is where paperless will search for new documents.
Point this to a folder where your scanner is able to put your scanned
documents.

CAUTION: You must specify an absolute path starting with / or a relative
path starting with ./ here. Examples:
  /mnt/consume
  ./consume

 

Consume folder [/vo2/docker/paperless/paperless-ngx/consume]:

 

The media folder is where paperless stores your documents.
Leave empty and docker will manage this folder for you.
Docker usually stores managed folders in /var/lib/docker/volumes.

CAUTION: If specified, you must specify an absolute path starting with /
or a relative path starting with ./ here.

 

Media folder []:

 

The data folder is where paperless stores other data, such as your
SQLite database, the search index and other data.
As with the media folder, leave empty to have this managed by docker.

CAUTION: If specified, you must specify an absolute path starting with /
or a relative path starting with ./ here.

 

Data folder []:

 

3. Login credentials
====================

Specify initial login credentials. You can change these later.
A mail address is required, however it is not used in paperless. You don't
need to provide an actual mail address.

 

Paperless username [hs7]: admin3
Paperless password:
Paperless password (again):
Email [admin3@localhost]:

 

Summary
=======

Target folder: /vo2/docker/paperless/paperless-ngx
Consume folder: /vo2/docker/paperless/paperless-ngx/consume
Media folder: Managed by docker
Data folder: Managed by docker

URL:
Port: 8779
Database: sqlite
Tika enabled: no
OCR language: eng
User id: 1000
Group id: 1003

Paperless username: admin3
Paperless email: admin3@localhost

Press any key to install.

Installing paperless...

--2024-05-01 16:27:02--  https://raw.githubusercontent.com/paperless-ngx/paperless-ngx/main/docker/compose/docker-compose.sqlite.yml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.110.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1681 (1.6K) [text/plain]
Saving to: ‘docker-compose.yml’

docker-compose.yml  100%[===================>]   1.64K  --.-KB/s    in 0.01s

2024-05-01 16:27:03 (150 KB/s) - ‘docker-compose.yml’ saved [1681/1681]

--2024-05-01 16:27:03--  https://raw.githubusercontent.com/paperless-ngx/paperless-ngx/main/docker/compose/.env
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.110.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 31 [text/plain]
Saving to: ‘.env’

.env                100%[===================>]      31  --.-KB/s    in 0s

2024-05-01 16:27:03 (86.8 KB/s) - ‘.env’ saved [31/31]

WARN[0000] /vo2/docker/paperless/paperless-ngx/docker-compose.yml: `version` is obsolete
[+] Pulling 2/2
 ✔ webserver Pulled                                                        1.3s
 ✔ broker Pulled                                                           2.5s
WARN[0000] /vo2/docker/paperless/paperless-ngx/docker-compose.yml: `version` is obsolete
[+] Creating 1/1
 ✔ Container paperless-broker-1  Created                                   2.3s
[+] Running 1/1
 ✔ Container paperless-broker-1  Started                                   1.9s
Paperless-ngx docker container starting...
Mapping UID and GID for paperless:paperless to 1000:1003
usermod: no changes
Creating directory scratch directory /tmp/paperless
Adjusting permissions of paperless files. This may take a while.
Waiting for Redis...
Connected to Redis broker.
Apply database migrations...
Operations to perform:
  Apply all migrations: account, admin, auditlog, auth, authtoken, contenttypes, django_celery_results, documents, guardian, paperless, paperless_mail, sessions, socialaccount
Running migrations:
  No migrations to apply.
Running Django checks
System check identified no issues (0 silenced).
Executing management command createsuperuser --noinput --username admin3 --email admin3@localhost
Superuser created successfully.
WARN[0000] /vo2/docker/paperless/paperless-ngx/docker-compose.yml: `version` is obsolete
[+] Running 2/2
 ✔ Container paperless-broker-1     Runn...                                0.0s
 ✔ Container paperless-webserver-1  S...                                   4

 

 

 

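Overall impressions

- Some PDF files fail to have their Korean text converted, but most are converted successfully.

- In JPG files it finds neither English nor Korean text. (I need to look into the OCR part further.)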

 

 

 

 

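**Update (2024.05.03)

- I tried switching the database to PostgreSQL.

- I tried installing Apache Tika.

  It is said to be used for extracting content from various document files (PDF, Word, Excel, PPT, etc.).

- I also added Gotenberg.

  It seems to be installed automatically together with Tika.

  It converts documents in various formats, such as HTML, Markdown, Word, and Excel, to PDF.

- I installed with the OCR language set to eng+kor.

  Adding it to the already installed instance did not seem to take effect, so I stopped the containers and applied it during a reinstall.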
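
As a hedged alternative to reinstalling, the OCR language can in principle be changed through the environment file. The sketch below assumes that file is `docker-compose.env` in the target folder, and `set_env_var` is a hypothetical helper. `PAPERLESS_OCR_LANGUAGES` asks the Docker image to install extra language packs, and `PAPERLESS_OCR_LANGUAGE` selects the languages used for OCR. This was not verified on the setup described in this post.

```shell
# Hypothetical helper: set KEY=VALUE in an env file, replacing any
# existing line for the same key.
set_env_var() {
  file="$1"; key="$2"; value="$3"
  touch "$file"
  # Keep every line that does not start with "KEY=" (grep exits non-zero
  # when nothing is left, which is fine here).
  grep -v "^${key}=" "$file" > "${file}.tmp" || true
  # Append the new setting and swap the file into place.
  printf '%s=%s\n' "$key" "$value" >> "${file}.tmp"
  mv "${file}.tmp" "$file"
}

# Assumed file name and location; adjust to your target folder.
ENV_FILE="${ENV_FILE:-./docker-compose.env}"

# Install the Korean language pack in the container (assumption: Docker install).
set_env_var "$ENV_FILE" PAPERLESS_OCR_LANGUAGES "kor"
# Use English and Korean together for OCR.
set_env_var "$ENV_FILE" PAPERLESS_OCR_LANGUAGE "eng+kor"

# The containers must be recreated for the change to apply, e.g.:
# docker compose up -d
```

Documents that were already processed keep their old OCR text, so a change like this would only show up on newly consumed files.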

 

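Conclusion

- I installed PostgreSQL + Tika + Gotenberg on the Pi 4. The speed was about the same as with the simple install, so it was usable.

  However, I did not find any advantage.

- I installed eng+kor to improve OCR, but Korean text was still not recognized.

- So I reverted to the original setup (the simple install: SQLite as the database and eng only for OCR).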

 

 

 

 

**Update 2024-05-07

- If the following error appears when connecting from the browser (Explorer):

Forbidden (403)
CSRF verification failed. Request aborted.

 

More information is available with DEBUG=True.

 

Untitled-5 copy.JPG

 

Solution

- In Portainer, add PAPERLESS_URL=https://example.com under environment.

  Alternatively, add it to the env file.

 

Untitled-4 copy.JPG

 

The picture above shows https://example.com/ , but with that value the error occurs.

So drop the trailing slash and use https://example.com instead.
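
The trailing-slash fix can be illustrated with a small sketch. `strip_trailing_slash` is a hypothetical helper, and https://example.com stands in for your own address.

```shell
# Hypothetical helper: remove a single trailing slash from a URL, if present.
strip_trailing_slash() {
  printf '%s\n' "${1%/}"
}

# Placeholder address; replace it with the URL you actually use.
PAPERLESS_URL="$(strip_trailing_slash "https://example.com/")"

# Print the line to put into the Portainer environment or the env file.
echo "PAPERLESS_URL=${PAPERLESS_URL}"
# prints: PAPERLESS_URL=https://example.com
```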

 

 

** Update (2024.06.14)

- Korean OCR works well. About 80% of the Korean text in JPG files is recognized, which is simply amazing.

 

Untitled-1 copy.JPG

 

 

 

Untitled-2 copy.JPG
