Did you ever saw a char and thought: “Damn, 1 byte for a single char is pretty darn inefficient”? No? Well I did. So what I decided to do instead is to pack 5 chars, convert each char to a 2 digit integer and then concat those 5 2 digit ints together into one big unsigned int and boom, I saved 5 chars using only 4 instead of 5 bytes. The reason this works is, because one unsigned int is a ten digit long number and so I can save one char using 2 digits. In theory you could save 32 different chars using this technique (the first two digits of an unsigned int are 42 and if you dont want to account for a possible 0 in the beginning you end up with 32 chars). If you would decide to use all 10 digits you could save exactly 3 chars. Why should anyone do that? Idk. Is it way to much work to be useful? Yes. Was it funny? Yes.

Anyone whos interested in the code: Heres how I did it in C: https://pastebin.com/hDeHijX6

Yes I know, the code is probably bad, but I do not care. It was just a funny useless idea I had.

  • ChaoticNeutralCzech@feddit.org
    link
    fedilink
    English
    arrow-up
    10
    ·
    1 month ago
    unsigned int turn_char_to_int(char pChar)
    {
        switch(pChar)
        {
        case 'a':
            return 10;
        case 'b':
            return 11;
        case 'c':
            return 12;
        case 'd':
            return 13;
        case 'e':
            return 14;
        case 'f':
            return 15;
        case 'g':
            return 16;
        case 'h':
            return 17;
        case 'i':
            return 18;
        case 'j':
            return 19;
        case 'k':
            return 20;
        case 'l':
            return 21;
        case 'm':
            return 22;
        case 'n':
            return 23;
        case 'o':
            return 24;
        case 'p':
            return 25;
        case 'q':
            return 26;
        case 'r':
            return 27;
        case 's':
            return 28;
        case 't':
            return 29;
        case 'u':
            return 30;
        case 'v':
            return 31;
        case 'w':
            return 32;
        case 'x':
            return 33;
        case 'y':
            return 34;
        case 'z':
            return 35;
        case ' ':
            return 36;
        case '.':
            return 37;
    
        }
    }
    

    Are you a monster or just stupid?

      • ChaoticNeutralCzech@feddit.org
        link
        fedilink
        English
        arrow-up
        3
        ·
        edit-2
        1 month ago

        If you couldn’t write

        if(pChar >= 'a' && pChar <= 'z') return pChar - ('a' - 10);
        

        I suppose you typed this “all the size of a lookup table with none of the speed” abomination manually too.

        • Zacryon@feddit.org
          link
          fedilink
          arrow-up
          6
          ·
          1 month ago

          switch case structures are very efficient in c and c++. They work similarly like an offset into memory. Compute the offset once (any expression in the ‘case’ lines), then jump. Using primitives directly, like here with chars, is directly the offset. Contrary to if-else branches, where each case must be evaluated first and the CPU has basically no-op cycles in the pipeline until the result of the branch is known. If it fails, it proceeds with the next one, waits again etc… (Some CPU architectures might have stuff like speculative branch execution, which can speed this up.)

          However, code-style wise this is really not elegant and something like your proposal or similar would be much better.

          • ChaoticNeutralCzech@feddit.org
            link
            fedilink
            English
            arrow-up
            3
            ·
            edit-2
            1 month ago

            Oh, I didn’t know that they were a LUT of jump addresses. Stil, a LUT of values would be more space-efficient and likely faster. Also, what if the values are big and sparse, e.g.

            switch (banknoteValue) {
                case 5000:
                    check_uv();
                    check_holograph();
                case 2000:
                    check_stripe();
                case 1000:
                    check_watermark();
            }
            

            …does the compiler make it into an if-else-like machine code instead?

            • Zacryon@feddit.org
              link
              fedilink
              arrow-up
              1
              ·
              edit-2
              3 hours ago

              Good question! I have read a bit more about it and this does indeed heavily depend on the respective compiler implementation. Depending on the compiler, it may prefer default if-else ladders for few cases. For more, dense cases, LUTs may play a larger role. For less dense cases binary search might be chosen.

              I inspected the generated assembler code of your (slightly extended) example via https://godbolt.org/

              The code I used:

              void check_uv();
              void check_holograph();
              void check_stripe();
              void check_watermark();
              
              void switch_test(int banknoteValue) {
              
                  switch (banknoteValue) {
                      case 5000:
                          check_uv();
                          check_holograph();
                      case 2000:
                          check_stripe();
                      case 1000:
                          check_watermark();
                  }
              
              }
              

              Using x86-64 gcc 15.2 this leads to a couple of cmp followed by je instructions, so “compare” and “jump to label if equal” which basically is a typical if-else ladder. I get the same for x64 msvc v19.43.
              Changing the cases to 1, 2 and 3 instead of 5000, 2000 and 1000 does not change the outcome.

              Increasing to 23 different but incrementally increasing cases (cases 1 to 23) does not change the outcome as well for gcc. But here msvc has introduced a performance optimization: it decreased the input value by one to get a range of 0 to 22 and then created a jump table, so a LUT with execution addresses. (I am not going to detail the assembler commands logic here, but you can use the C++ code below and take a look yourself. :) )

              So even in this simple example we can already see how different compilers may implement switch cases differently depending on its structure. Even though gcc chose the apparently less efficient solution here, usually one may trust on the compiler choosing the most efficient switch implementation. ;)

              As far as I know, we would not even get the chance of similar optimizations if choosing if-else ladders directly instead of a switch-case structure. It would be interesting to put this to a test though and see whether some compilers translate if-else ladders equivalently with the performance benefits that can currently come with switch structures.

              The inflated code:

              void check_uv();
              void check_holograph();
              void check_stripe();
              void check_watermark();
              
              void switch_test(int banknoteValue) {
              
                  switch (banknoteValue) {
                      case 1:
                          check_uv();
                          check_holograph();
                      case 2:
                          check_stripe();
                      case 3:
                          check_watermark();
                      case 4:
                          check_watermark();
                      case 5:
                          check_watermark();
                      case 6:
                          check_watermark();
                      case 7:
                          check_watermark();
                      case 8:
                          check_watermark();
                      case 9:
                          check_watermark();
                      case 10:
                          check_watermark();
                      case 11:
                          check_watermark();
                      case 12:
                          check_watermark();
                      case 13:
                          check_watermark();
                      case 14:
                          check_watermark();
                      case 15:
                          check_watermark();
                      case 16:
                          check_watermark();
                      case 17:
                          check_watermark();
                      case 18:
                          check_watermark();
                      case 19:
                          check_watermark();
                      case 20:
                          check_watermark();
                      case 21:
                          check_watermark();
                      case 22:
                          check_watermark();
                      case 23:
                          check_watermark();
                  }
              
              }