Hello,
I bought a Synology DS218+ last week and am getting my first experience with synOCR. My configuration looks like this:
synOCR-user: synOCR
synOCR-user is admin: yes
synOCR-version: 1.3.1
Architecture: x86_64
DSM-build: 42962
Device: 218plus (3667225027)
current Profil: eh4
monitor is running?: no
DB-version: 8
used image (created): jbarlow83/ocrmypdf:latest (2023-03-30T05:15:57)
document author:
used ocr-parameter (raw): -srd -l deu+eng
OCR-arg 1: -srd
OCR-arg 2: -l
OCR-arg 3: deu+eng
ocropt_array: -srd -l deu+eng
search prefix:
replace search prefix: yes
renaming syntax: §tag_§tit_ocred
Symbol for tag marking: #
target file handling: useCatDir
Document split pattern: SYNOCR-SEPARATOR-SHEET
split page handling: discard
clean up spaces: false
Date search method: use standard search via RegEx
date found order: firstfound
source for filedate: ocr
ignored dates by search: 2021-02-29;2020-11-31
date range in past: 0 [absolute: 0]
date range in future: 0 [absolute: 0]
So far everything works. My PDFs are detected and processed.
However, my PDF files are NOT renamed according to the pattern "§tag_§tit_ocred".
Excerpt from the log file:
---------------------------------------------------------------------------------------------------------------
CURRENT FILE: ➜ doc00485520230201154318.pdf
➜ File permissions source file:
-rw-rw-r-- 1 synOCR synOCR 99046 Feb 1 15:43 /volume1/DATEN/_OUTPUT/synOCR_tmp_1680505199/doc00485520230201154318.pdf
-----------------------------------------------------------------------------------
| search tags in ocr text: |
-----------------------------------------------------------------------------------
no tags defined
-----------------------------------------------------------------------------------
| search for a valid date in ocr text: |
-----------------------------------------------------------------------------------
run RegEx date search - search for date format: 1 (1 = dd mm [yy]yy; 2 = [yy]yy mm dd; 3 = mm dd [yy]yy)
Dates found: 2
check date (dd mm [yy]yy): 24.12.2022
➜ valid
day: 24
month:12
year: 2022
-----------------------------------------------------------------------------------
| rename and sort to target folder: |
-----------------------------------------------------------------------------------
[runtime up to now: 00:00:01]
➜ renaming:
apply renaming syntax ➜ _doc00485520230201154318_ocred
[runtime up to now: 00:00:01]
➜ insert metadata (use python PyPDF2)
used metadata:
➜ '/Author': '',
➜ '/Keywords': '',
➜ '/CreationDate': 'D:20221224'
[runtime up to now: 00:00:01]
target file: _doc00485520230201154318_ocred.pdf
---------------------------------------------------------------------------------------------------------------
What is wrong here? Why is the target file name _doc00485520230201154318_ocred.pdf?
I expected a target file name such as "<a word read from the file>_doc00485520230201154318_ocred.pdf".
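For what it's worth, the log output looks consistent with a plain placeholder substitution in which §tag expands to an empty string (the log says "no tags defined") and §tit expands to the original file name. The following is only a minimal sketch of that assumption, not synOCR's actual code: the function name `apply_renaming_syntax` is hypothetical, and prefixing each tag with the tag symbol is my guess.

```python
from typing import Sequence


def apply_renaming_syntax(syntax: str, tags: Sequence[str], title: str,
                          tag_symbol: str = "#") -> str:
    """Hypothetical placeholder expansion; NOT taken from synOCR's source.

    Assumption: §tag becomes the found tags (each prefixed with the tag
    symbol) and §tit becomes the original file name without extension.
    """
    tag_part = "".join(f"{tag_symbol}{tag}" for tag in tags)
    return syntax.replace("§tag", tag_part).replace("§tit", title)


# With no tags defined, §tag expands to an empty string, so only the
# leading underscore from the pattern remains in front of the title.
result = apply_renaming_syntax("§tag_§tit_ocred", [], "doc00485520230201154318")
print(result + ".pdf")  # prints "_doc00485520230201154318_ocred.pdf"
```

If that assumption holds, the leading underscore in the target file name would simply be the separator left over after an empty §tag.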
Many thanks in advance for your support.

