Given a compressed string, give the number of ways it can be decompressed.

  • A compressed string is a string where substrings of repeated characters are changed to a format similar to aaa -> a3
  • For example for the string aaabbhhhh33aaa the compressed string would be a3b2h432a3

As an example of counting decompressing options. Say you have the input a333b2. This can be read as a333 b2 or a3 33 b2 so you would need to return the value 2 as there are 2 options.


You must accept the input as a command line argument (entered when your app is ran) and print out the result

(It will be called like node main.js aaaaa or however else to run apps in your language)

You can use the solution tester in this post to test you followed the correct format https://programming.dev/post/1805174


Checking through the solutions of last weeks problems should be happening this weekend monday + tuesday. Been getting too much work and theres a game jam thats been going on but should be able to go through them all then. I may do the easy one last since it cant be checked using the solutions tester so that one will take a bit longer.

Will also try to go through and give solutions to these ones as well on the weekend and will try to do solutions as soon as possible from now on so you can see if you failed the test cases. For the ones done last week, after I go through them all ill give 1 extra day from then on for any resubmissions

  • shape-warrior-t
    link
    fedilink
    210 months ago

    Thanks for the update on checking through solutions, and thanks in general for all the work you’ve put into this community!

    Would just like to clarify: what are the valid decompressed strings? For an input of a333a3, should we return 2 (either a333 a3 or a3 33 a3) or 1 (since a333 a3 isn’t a possible compression – it would be a336 instead)? Do we have to handle cases like a00010, and if so, how?

    • AtegonOPMA
      link
      2
      edit-2
      10 months ago

      The first one would give 2 since it doesnt validate what its supposed to have been when it was compressed

      For a00010 that one would be a0 00 10 or a000 10 or a0 0010 or a00 010 or a00010 for 5 since it doesnt validate

      • shape-warrior-t
        link
        fedilink
        110 months ago

        So every list of strings, where each string is some character followed by one or more digits, is a distinct, valid decompressing option. Thanks for clarifying!

  • shape-warrior-t
    link
    fedilink
    210 months ago

    Given an input c, outputs the number of distinct lists of strings lst such that:

    1. ''.join(lst) == c
    2. for every string s in lst, s consists of an arbitrary character followed by one or more characters from ‘0123456789’

    Sure hope I didn’t mess this up, because I think the fundamental idea is quite elegant! Should run successfully for all “reasonable” inputs (as in, the numeric output fits in a uint64 and the input isn’t ridiculously long). Fundamental algorithm is O(n) if you assume all arithmetic is O(1). (If not, then I don’t know what the time complexity is, and I don’t feel like working it out.)

    from functools import cache
    from itertools import pairwise
    from math import prod
    
    @cache
    def fibonacci(n: int) -> int:
        if n == 0:
            return 0
        if n == 1:
            return 1
        return fibonacci(n - 1) + fibonacci(n - 2)
    
    def main(compressed: str) -> int:
        is_fragment_start = [i == 0 or c not in '0123456789' for i, c in enumerate(compressed)]
        fragment_start_positions = [i for i, s in enumerate(is_fragment_start) if s]
        fragment_lengths = [stop - start for start, stop in pairwise(fragment_start_positions + [len(compressed)])]
        return prod(fibonacci(fragment_length - 1) for fragment_length in fragment_lengths)
    
    if __name__ == '__main__':
        from argparse import ArgumentParser
        parser = ArgumentParser()
        parser.add_argument('compressed')
        print(main(parser.parse_args().compressed))
    
    

    Idea: a00010 -> [a000, 10] -> [length 4, length 2] -> F(4) * F(2)
    01a102b0305 -> [01, a102, b0305] -> [length 2, length 4, length 5] -> F(2) * F(4) * F(5)
    where F(n) = fibonacci(n - 1) is the number of ways to partition a string of length n into a list of strings of length ≥2.

    F(2) = 1 = fibonacci(1), F(3) = 1 = fibonacci(2), and F(n) = F(n - 2) + F(n - 1), so F is indeed just an offset version of the Fibonacci sequence.
    To see why F(n) = F(n - 2) + F(n - 1), here are the ways to split up ‘abcde’: [‘ab’] + (split up ‘cde’), [‘abc’] + (split up ‘de’), and [‘abcde’], corresponding to F(5) = F(3) + F(2) + 1.
    And the ways to split up ‘abcdef’: [‘ab’] + (split up ‘cdef’), [‘abc’] + (split up ‘def’), [‘abcd’] + (split up ‘ef’), and [‘abcdef’], corresponding to F(6) = F(4) + F(3) + F(2) + 1 = F(4) + F(5) = F(6 - 2) + F(6 - 1).
    The same logic generalizes to all n >= 4.

    • @Quasari
      link
      English
      210 months ago

      Without memoization, I believe the Fibonacci sequence is O(2^N). It’s dependent on how long a sequence of digits is in the input, so your worst case is like O(2^N) if the string is mostly digits and best case being O(N) if there are only 1 or 2 digit sequences.

      Someone can correct me if I’m wrong.

      • shape-warrior-t
        link
        fedilink
        110 months ago

        My implementation is memoized by functools.cache, but that is a concern when it comes to recursive Fibonacci. That, and stack overflows, which are also a problem for my code (but, again, not for “reasonable” inputs – fibonacci(94) already exceeds 2^64).

        Time complexity-wise, I was more thinking about the case where the numbers get so big that addition, multiplication, etc. can no longer be modelled as taking constant time. Especially if math.prod and enumerate are implemented in ways that are less efficient for huge integers (I haven’t thoroughly checked, and I’m not planning to).

        • @Quasari
          link
          English
          210 months ago

          functools.cache

          That’s pretty cool. I haven’t dived too deep into python, so I should of looked up the library when you attached the @cache decorator. I learned something.

  • @Quasari
    link
    English
    2
    edit-2
    10 months ago

    So, I also realized its the fibonacci sequence for number of combinations of numbers. All I care about is counting sequential digits, we skip a character after every sequence. We just need to account for if the first character is a number and we are good. I did this in ruby. I didn’t really try for anything bonus points wise.

    Pastebin because formatting doesn’t like & or <

    My code also outputs 0 if the sequence itself is invalid(ends on a non-digit character, or has two non-digit characters in a row)

  • brie
    link
    fedilink
    110 months ago

    Very late, but C solution using fib: Git

    #include 
    #include 
    #include 
    
    size_t fib(size_t n)
    {
    	/* n=93 is the last fib(n) that fits in u64. */
    	static size_t cache[94] = { 0, 1 };
    	static size_t top_valid = 1;
    	size_t a, b;
    
    	if (top_valid >= n) {
    		return cache[n];
    	} else if (n >= sizeof(cache) / sizeof(*cache)) {
    		fputs("Fib overflow\n", stderr);
    		exit(EXIT_FAILURE);
    	}
    
    	a = cache[top_valid - 1];
    	b = cache[top_valid];
    	for (; top_valid < n; top_valid += 2) {
    		a = cache[top_valid + 1] = a + b;
    		b = cache[top_valid + 2] = a + b;
    	}
    
    	return cache[n];
    }
    
    int main(int argc, char **argv)
    {
    	char *p, c;
    	size_t combos = 1;
    	int digits = 0;
    	
    
    	if (argc != 2) {
    		fputs("Bad invocation.\n", stderr);
    	}
    
    	if (!*(p = argv[1])) {
    		goto done;
    	}
    
    	for (;;) {
    		c = *++p;
    		if (c < '0' || c > '9') {
    			combos *= fib(digits);
    			digits = 0;
    			if (!c) {
    				break;
    			}
    		} else {
    			digits++;
    		}
    	}
    
    done:
    	printf("%zu\n", combos);
    
    	return 0;
    }
    

    Zig would probably be a good language for a solution if there is a known maximum number of combinations, since it has large fixed-width types and comptime.