Situation: I have a scanned book in Chinese that I’d like to read, but no translation is available. I really want to read it because it would probably help a lot with my university project.

What I tried: I ran OCR on the scan to add a text layer so I could use a translation tool on it, but I found no way to make the OCR text visible. I also tried this tool, but its OCR didn’t work for me, and I found no way to use it with a local model.

Have any of you ever done a similar task? I’d appreciate any kind of suggestions and tips.
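For the translation half of the approach described above, one practical detail is that many translation tools and local models accept only a limited amount of text at a time. The sketch below shows one way to split OCR'd Chinese text into pieces at sentence-ending punctuation. The function name `chunk_text` and the `max_chars` limit of 500 are illustrative assumptions, not something any particular tool requires.

```python
import re

# Matches a run of text up to and including one Chinese sentence-ending mark.
# A trailing fragment without such a mark is still captured.
SENTENCE = re.compile(r"[^。！？]+[。！？]?")


def chunk_text(text, max_chars=500):
    """Split text into chunks of at most max_chars, breaking only at sentence ends.

    max_chars is a hypothetical limit; set it to whatever your translation
    tool accepts. A single sentence longer than max_chars is kept whole and
    becomes its own oversized chunk.
    """
    chunks = []
    current = ""
    for sentence in SENTENCE.findall(text):
        # Start a new chunk if adding this sentence would exceed the limit.
        if current and len(current) + len(sentence) > max_chars:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be passed to the translation tool in order, and the results joined back together.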

  • morto@piefed.social (OP) · 14 hours ago

    I used Tesseract, but the output PDF didn’t have visible text, and I found no way to change that. Maybe I don’t know how to use it properly, or it isn’t intended to keep the formatting.

    • bitofarambler@crazypeople.online · 11 hours ago (edited)

      Try gImageReader.

      It’s a front end to Tesseract, and its GUI and option menus make it easier to work with.

      Load the file, execute the program.

      That’s all I had to do for a successful OCR.
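On the invisible text mentioned earlier in the thread: as far as I know, Tesseract's PDF output is meant to be a searchable PDF, so it keeps the page image and puts the recognized text in a hidden layer on top of it. For translation, plain-text output is usually easier to work with. Below is a minimal sketch that calls the Tesseract command line from Python. It assumes the `tesseract` executable and the `chi_sim` (Simplified Chinese) language data are installed, and that the pages have already been exported as image files. The file name `page_001.png` and the helper names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="chi_sim"):
    """Return the Tesseract CLI arguments for plain-text output.

    The trailing "txt" selects plain-text output, so the result is written
    to <output_base>.txt. Use lang="chi_tra" for Traditional Chinese.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang, "txt"]


def ocr_page(image_path, output_base, lang="chi_sim"):
    """Run Tesseract on one page image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")


# Example (hypothetical file name):
#   text = ocr_page("page_001.png", "page_001")
```

The resulting text file has no layout information, but the text is visible and can be pasted into whatever translation tool you prefer.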