Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
I'm trying to figure out a way to do short division (ie. dividing a multi-digit value with a single-digit value) when all you have available is single-digit arithmetic (and bitwise) operations. Short division is quite simple if you could divide (and get a remainder of) a 2-digit value by a 1-digit value, but it's not as trivial if all you have is single-digit division. Essentially, one would need a way to divide a 2-digit value by a single digit value, using only single-digit arithmetic. For short division in particular the constraint can be safely assumed that the most-significant digit of the dividend is always smaller than the divisor (and therefore the quotient and remainder are both always single-digit). Preferably the solution would not involve looping and searching (because in this case we are not talking about digits in the range 0, 9, but digits in the range 0, 264-1. Even a 64-iteration binary-search loop would be a bit too much.)
Player (98)
Joined: 12/12/2013
Posts: 380
Location: Russia
Warp wrote:
I'm trying to figure out a way to do short division (ie. dividing a multi-digit value with a single-digit value) when all you have available is single-digit arithmetic (and bitwise) operations.
You can look at multi digit value as bunch of half-digits. For example, if digit is byte, then you can look at nibbles, and divide two nibbles by single nibble. Otherwise you can always make 2digit division by single digit by binary search.
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
Problem is, if we halve the size of the digits (eg. consider the operands to consist of 32-bit values instead of 64-bit values) now it's not a short division anymore, because it would be dividing a multi-digit value by a 2-digit value (because now the divisor is being interpreted as having the size of two 32-bit values). While now we do have a 64-bit/32-bit division operation available, I'm not sure this becomes any easier, because the divisor is not single-digit anymore. Or is there an easy "short division" algorithm that works for a two-digit divisor?
Player (98)
Joined: 12/12/2013
Posts: 380
Location: Russia
Warp wrote:
I'm trying to figure out a way to do short division (ie. dividing a multi-digit value with a single-digit value) when all you have available is single-digit arithmetic (and bitwise) operations.
I'm interpreting your task as: we can do N bits by N division / addition / subtraction / multiplication. Also multiplication of N bits by N gives N bits. (otherwise why you can't divide 2N bits by N?) There are several ways, depending on what you like and dislike. First, as I said: split N bits into N/2 bits and do long division, because you can now divide (N/2)*2 by (N/2) which is required for long division. Also, long division require long multiplication which require digit by digit multiplication giving 2digit, i.e. N/2 by N/2 giving N <- this we also have. Other way is to use binary search to implement 2N by N division, and use basic short division algorithm. Now, what I wasn't proposed before, but you can do it as well: walk bit by bit and subtract. It doesn't require division at all. Only subtraction and bitwise operation. Example, 4 byte by byte division: https://onlinegdb.com/HklFPaHc7w Input: two numbers separated by space. You can also uncomment commented code to see debugging output. You can use any algorithm that is used in CPU for making instructions. I mean those algorithms that is used to make CPU able to do division by single instruction. But if you want to take advantage of having division instruction already, then... above is incomplete list of what you can do.
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
Evaluate:
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
r57shell wrote:
I'm interpreting your task as: we can do N bits by N division / addition / subtraction / multiplication. Also multiplication of N bits by N gives N bits.
If it helps, it is possible to get the full 128-bit result of a 64-bit multiplication in this case. (The architecture in question is aarch64.)
(otherwise why you can't divide 2N bits by N?)
The aarch64 instruction set has a way to get the full 128-bit result of a multiplication, but does not have support for dividing a 128-bit value by a 64-bit one.
First, as I said: split N bits into N/2 bits and do long division, because you can now divide (N/2)*2 by (N/2) which is required for long division.
I wish I knew how to implement long division, but I don't. (The grade school "long division" algorithm is pretty useless because it's hilariously circular and completely unhelpful. It essentially says "to calculate A/B, find the largest C for which B*C<=A"... which is the very definition of A/B. In other words, it's just saying "to calculate A/B, calculate A/B". Gee, thanks.)
Other way is to use binary search to implement 2N by N division, and use basic short division algorithm.
One would think that when a fast N by N division is available, one wouldn't have to resort to a complicated 64-step binary search in order to calculate a 2N by N division. There has to be a way to take advantage of that N/N operation to make it faster.
Player (98)
Joined: 12/12/2013
Posts: 380
Location: Russia
p4wn3r wrote:
Evaluate:
Spoiler allert: my solution is long and ugly. Try to find your own. Had to spend several hours. http://mathb.in/44788 It's funny, it's wrong :D Need to investigate. Ah, no, it's right. My code had mistake, I had n^2-k. Yep, I went to check solution when posted. Wasn't doing any code before :b
Warp wrote:
If it helps, it is possible to get the full 128-bit result of a 64-bit multiplication in this case. (The architecture in question is aarch64.)
It surely helps.
Warp wrote:
The aarch64 instruction set has a way to get the full 128-bit result of a multiplication, but does not have support for dividing a 128-bit value by a 64-bit one.
First, as I said: split N bits into N/2 bits and do long division, because you can now divide (N/2)*2 by (N/2) which is required for long division.
I wish I knew how to implement long division, but I don't.
I described it here and also gave an example. You could ask back then what you don't understand. I'll look into that tomorrow. I'm exhausted by p4wn3r's task. Spoiler: long division will need approximately len_bits/32 long multiplications. Comparing to binary search that would take len_bits.
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
r57shell wrote:
p4wn3r wrote:
Evaluate:
Spoiler allert: my solution is long and ugly. Try to find your own. Had to spend several hours. http://mathb.in/44788
It looks way too convoluted. The problem is pretty easy if you use asymptotic expansions
Player (98)
Joined: 12/12/2013
Posts: 380
Location: Russia
p4wn3r wrote:
r57shell wrote:
p4wn3r wrote:
Evaluate:
Spoiler allert: my solution is long and ugly. Try to find your own. Had to spend several hours. http://mathb.in/44788
It looks way too convoluted. The problem is pretty easy if you use asymptotic expansions
All asymptotes I can think about giving product of 1 or sum of 0 which obviously wrong. Looking forward for elegant solution.
Player (36)
Joined: 9/11/2004
Posts: 2631
I found that using the Stirling approximation for the gamma function got an expression that converged to the correct value, according to Mathematica at least. But I can't figure out how to take the limit. The Laurent Series expansion is 1/2 + O(1/n) at infinity, but again, I can't seem to do it by hand. Essentially, you can completely eliminate the product and k from this expression by using a quotient of factorials. Then you can change the factorials into the Gamma function and use the Stirling approximation, I also took the logarithm of everything in an attempt to make the math nicer. You get: 1 - n - (2n-2) ln(n) + (n^2 + n - 1/2) ln(n^2 + n) - (n^2 - 1/2) ln(n^2 + 1) + O(1/n) after doing this. Which is a bit of a mess. I'm probably missing something super easy.
Build a man a fire, warm him for a day, Set a man on fire, warm him for the rest of his life.
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
I have two similar routes to solving it. The first is to notice that (1 + k/n²) ~ ek/n² when n goes to infinity. There are two ways to notice this. The first one is to realize that (1+k/n²)^n² converges to e. So, by taking the 1/n² power of both sides we get the result. The other way is to simply take the first order Taylor expansion of ek/n² to see that it's indeed (1 + k/n²). This way, we reduce the product to prod ek/n² for k from 1 to n-1. This is just en(n-1)/2n², which tends to sqrt(e) as n goes to infinity. Yet another way is to take logs. If we call the limit L, we have ln L = sum_k ln(1+k/n²) We can expand ln(1+k/n²) = k/n² + o(1/n²) The error terms are all o(1/n²) because k<n. Now, if we sum n terms, the o(1/n²) terms become o(1/n), and we have ln L = sum(k)/n² + o(1/n) = n(n-1)/2n² + o(1/n) Since the o(1/n) terms go to 0 we have ln L = 1/2, and L = sqrt(e)
Player (98)
Joined: 12/12/2013
Posts: 380
Location: Russia
p4wn3r wrote:
We can expand ln(1+k/n²) = k/n² + o(1/n²)
I see now. Nice. I'm not strong enough in calculus to say that it's rigorous to me. I have my own doubts in asymptotes.
Warp wrote:
r57shell wrote:
I'm interpreting your task as: we can do N bits by N division / addition / subtraction / multiplication. Also multiplication of N bits by N gives N bits.
If it helps, it is possible to get the full 128-bit result of a 64-bit multiplication in this case. (The architecture in question is aarch64.)
Here is 128 by 64 division for aarch64. highest bit of b should be set! This is what I mean by aligned. This is essential of long division. If it's not set then shift both operands until it's set.
Language: cpp

typedef unsigned __int128 u128; typedef unsigned long u64; void div128_64_aligned(u128 a, u64 b, u128 *res, u64 *rem) { u64 high = (a>>64); u64 res_high = high/(b>>32); u128 tmp = (u128(res_high)*u128(b)); u128 left = (a>>32); while (tmp > left) { res_high -= 1; tmp -= b; } left = a - (tmp<<32); high = (left>>32); u64 res_low = high/(b>>32); tmp = (u128(res_low)*u128(b)); while (tmp > left) { res_low -= 1; tmp -= b; } left = left - tmp; if (rem) *rem = left; if (res) *res = res_low + (u128(res_high)<<32); }
You can check resulting asm on godbolt.org. Use -O2 flag and GCC or Clang. If it's not enough optimized: optimize better yourself. This code above is basically long-division with taking into account fixed size of operands. I actually implemented n to n addition, n to 1 multiplication and only then realized that you may use this and then use short-division algorithm using this code. If you need actual division, it would require for me to also make n by m multiplication and n by 1 division and only then n by m division. Which is a lot of work. And code above is already time investment. You can verify code trying same algorithm for 64 by 32 division by reducing all operands in size and all bit shifts accordingly. Regarding to while loops: each of them may run up to 4 times. Corner test is 0xfffffffffffffffb0000000000000000 divided by 0x8fffffffffffffff. If alignment (making highest bit set) is bottleneck, you may try to implement version without preceding shift. Also, why not use libraries made for this? mpmath, gmp?
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
r57shell wrote:
Here is 128 by 64 division for aarch64. highest bit of b should be set! This is what I mean by aligned. This is essential of long division. If it's not set then shift both operands until it's set.
I notice that if the inputs are shifted to the left until the highest bit of b is set, then the returned remainder also needs to be shifted to the right by that same amount. I must admit that without understanding the math behind it, the code is almost incomprehensible to me. Even after reading it for like an hour I cannot understand what it's doing. (It's also rather appalling how astonishingly complicated multiprecision division is, compared to all the other arithmetic operations. Or even short division, if a double-sized-register-divided-by-register operation is not available. Understanding even something like the Karatsuba multiplication algorithm is peanuts compared to this. It's inscrutable.) I might settle for a short division with a limit of 32 bits for the divisor. At least that's trivial to implement.
Player (98)
Joined: 12/12/2013
Posts: 380
Location: Russia
Warp wrote:
I must admit that without understanding the math behind it, the code is almost incomprehensible to me.
I'll try explain it once again. Idea is simple. As I said it's long division. so instead of AB/C (where A, B, C is digits) we calculate: ABCD/EF. Key insight that it's approximately AB00/E0. So first division is exactly AB00/E0 which is if you reduce, you'll get (AB/E)*10. Lets call it res_high. We know that ABCD/EF is approximately res_high. What we do? We fix in the way that real ABCD/EF >= res_high. Multiply both sides by EF you'll get ABCD >= res_high*EF. This is first while loop. It just comparing without multiplying by 10 to avoid overflow. Because res_high can be 2 digits already, and if you multiply by 10 you'll get overflow. Now we know that ABCD/EF >= res_high. Subtract! Get remainder! variable left = ABCD - EF*res_high. Now, again using same trick. left / EF is approximately left / E0. Second division is exactly res_low = left / E0. We want to fix it to be left / EF >= res_low. It's same as left >= res_low * EF. This is second while loop. Now we have remainder that is less than EF, this means that our res_high*10+res_low is quotient, and "left" is remainder. Key thing is this alignment. It guarantees that while loops will do at most 4 iterations. And it's very unlikely. Also, this approximation never underestimate. So if you're off, you need to subtract. You never need to add. Long division in arbitrary case is same it just use long multiplication to do fix. So, summary long division is following: 1) make approximation by division 2 digits by 1 digit. You'll find out as many bits as you have in single digit. Mistake is up to 4 (units not bits!). 2) make long multiplication. 3) do up to four subtractions to find out precise result of division. 4) now you have remainder, and you have determined single digit of quotient. 5) repeat process with remainder (it's at least one digit shorter). Also, I can prove that while loops take up to four iterations if you want.
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
Matt Parker in his latest video points out how, perhaps a bit surprisingly, there is no closed-form expression for the perimeter of an ellipse (at least not for the general case). That got me thinking: Are there ellipses (other than the circle) where the major radius, the minor radius and the perimeter are all expressible with closed-form expressions? (In this context "closed-form expression" excludes integrals.)
Editor, Expert player (2079)
Joined: 6/15/2005
Posts: 3282
Warp wrote:
That got me thinking: Are there ellipses (other than the circle) where the major radius, the minor radius and the perimeter are all expressible with closed-form expressions? (In this context "closed-form expression" excludes integrals.)
I assume by "closed-form expression" you are also excluding infinite sums. Other than the circle and the degenerate ellipse (minor radius = 0), I don't think it is possible. However there are ways to approximate perimeter of an ellipse using infinite sums.
Player (36)
Joined: 9/11/2004
Posts: 2631
There is a "closed-form" expression. But it uses the elliptic E function. Circumference of an ellipse with semi major axes a and b is: 4 b E(sqrt(1 - a^2/b^2)) or equivalently, if sqrt(1 - a^2/b^2) is complex you can use 4 a E(sqrt(1 - b^2/a^2)) E is defined on the complex plane though, so it doesn't really care about complex arguments. The derivation of this is, given the parametric form of the ellipse: x(t) = a cos(t) y(t) = b sin(t) It is possible to compute the arc length of the ellipse as: 4 * integral(sqrt(x'(t)^2 + y'(t)^2) dt, t = 0 to pi/2) Where the integral calculates the arc length along 1/4 of the ellipse, and then the value is multiplied by 4 to get the total value (due to symmetry). x'(t)^2 = a^2 sin^2(t) y'(t)^2 = b^2 cos^2(t) So our integrand simplifies to (after some massaging): b sqrt(1 - (1 - a^2/b^2) sin^2(t)) This is also in the form that the definition of the elliptic E function wants: E(k) = integral(sqrt(1 - k^2 sin^2(t)) dt, t = 0 to pi/2) So in our case we have k^2 = 1 - a^2/b^2, so we get the formula stated at the top. Some numeric packages work in terms of k^2 instead of k. For instance, Mathematica's EllipticE function evaluates E(k^2) not E(k). On the other hand Maple follows the definition given the NIST digital library of special functions..
Build a man a fire, warm him for a day, Set a man on fire, warm him for the rest of his life.
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
Warp wrote:
Matt Parker in his latest video points out how, perhaps a bit surprisingly, there is no closed-form expression for the perimeter of an ellipse (at least not for the general case). That got me thinking: Are there ellipses (other than the circle) where the major radius, the minor radius and the perimeter are all expressible with closed-form expressions? (In this context "closed-form expression" excludes integrals.)
I would be careful to say "closed form expression" because this term can be a bit biased. For example, if you solve a problem and find sin(x) as an answer, most people would consider this a closed-form expression, until you actually go calculate it. How do you calculate sin(x) in general? You need to have an infinite series, integral representation, or the limit of an expression to do it, just like you need for more complicated expressions. When people usually say a problem has a "closed-form" solution, what they mean by that is that the general answer has so many interesting algebraic properties that makes it very simple to calculate, and in the example you provided, the answer is given by elliptic functions, which satisfy many "nice" properties like trigonometric functions do, so I would consider it as a closed form expression. By "nice" I mean more precisely that they are closely linked to modular forms, which can solve problems in many distinct areas of math. A lot of fancy research mathematics is just fancy ways to manipulate modular forms to show that they have all the structure needed to describe another algebraic structure. If I take your question to mean, for which ellipses the perimeter is an algebraic number, the answer was found at the end of the 19-th century, and beginning of the 20-th. They happen at singular moduli. This term is very old and you're unlikely to find recent references about this. Modern mathematicians usually study the j-invariant first. It turns out that the j-function is algebraic whenever its argument is an imaginary quadratic complex number, essentially because of a phenomenon called complex multiplication, and these singular moduli are all a consequence of that.
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
While it might not be a 100% unambiguous universally agreed set-in-stone definition, there is a relatively widely accepted definition of what constitutes a "closed-form expression". The set of operations and functions to be included in "closed-form expressions" may be somewhat arbitrary, but there exists a somewhat consistent consensus on this list. For example the basic trigonometric functions are usually included in the list, but more exotic functions (like the gamma function or the bessel functions) are not. I don't think it's very useful to start nitpicking about the exact definition, and just go with, for example, what's listed at the wikipedia article. It's also not very useful to engage in trickery like "we define the function smartass(a, b) to be the exact perimeter of an ellipse with radiuses a and b. Therefore the closed-form expression for the perimeter is smartass(a, b)." This would just be a useless circular (hah!) definition.
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
Warp wrote:
It's also not very useful to engage in trickery like "we define the function smartass(a, b) to be the exact perimeter of an ellipse with radiuses a and b. Therefore the closed-form expression for the perimeter is smartass(a, b)." This would just be a useless circular (hah!) definition.
It's because of stuff like this that my disposition to reply to any of your questions decreases exponentially over time. If you bothered to look up the references I sent you, you'd see that the elliptic function, which is the answer to your question, is not a smart-ass function, but has many interesting properties, see for example the Jacobi elliptic functions, which are linked to geometry and have very similar properties to trigonometric ones. If there are doubts about its usefulness, they are also linked to elliptic curves, which form the basis of cryptographic algorithms that allow you to post stuff like this.
Warp wrote:
I don't think it's very useful to start nitpicking about the exact definition, and just go with, for example, what's listed at the wikipedia article.
The thing is: it is useful, much more useful than set in stone some limited form of expressions. In the words of Grothendieck, "The introduction of the digit 0 or the group concept was general nonsense too, and mathematics was more or less stagnating for thousands of years because nobody was around to take such childish steps..." For example, for centuries people wanted to find a "closed-form" expression for prime numbers. The difficulty of the problem was just because what they understood by "closed-form" (polynomials, exponentials, etc.) could never work out. And it turns out it is rather simple to generate prime numbers if you do not limit yourself to these kinds of expressions. So, the useless thing, in retrospect, was to use the wrong tools to solve the problem. Here's something that would be much more productive: why not come up with problems that you think are useful, solve them and explain to people how to do it? As far as I know, staying on the sidelines saying something is useless is the easiest thing in the world.
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
p4wn3r wrote:
It's because of stuff like this that my disposition to reply to any of your questions decreases exponentially over time.
No need to worry about it.
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
blackpenredpen time once again: Find the complex solution or solutions to: ex = ln(x)
Player (36)
Joined: 9/11/2004
Posts: 2631
If e^z = ln(z), because e^z and ln(z) are inverse functions, then it is the case that also e^z = z. Then solving using the Lambert W function is rather trivial: ze-z = 1 -ze-z = -1 -z = W(-1) z = -W(-1)
Build a man a fire, warm him for a day, Set a man on fire, warm him for the rest of his life.
Player (36)
Joined: 9/11/2004
Posts: 2631
Here's an older problem that's been giving me some difficulty: With the clarification that if more than one student fits this criteria then one will be selected at random uniformly. The answer to the problem posed in the question is known, and is a large irreducible fraction for the given prompt using number of students = n = 10. I also know the values for n up to 20 using a brute force algorithm in their irreducible fraction forms. However, I've been completely unable to analyze this problem farther. I have no idea how to even set up a recursive definition for the probability of n, let alone a closed form solution. Best of luck!
Build a man a fire, warm him for a day, Set a man on fire, warm him for the rest of his life.
Editor, Expert player (2079)
Joined: 6/15/2005
Posts: 3282
OmnipotentEntity wrote:
If e^z = ln(z), because e^z and ln(z) are inverse functions, then it is the case that also e^z = z. Then solving using the Lambert W function is rather trivial: ze-z = 1 -ze-z = -1 -z = W(-1) z = -W(-1)
While -W(-1) is one solution (I'm not going to talk more about this; go see blackpenredpen if you're interested), this reasoning doesn't explain why there aren't any other solutions, since any z such that e^(e^z) = z (not necessarily e^z=z) could be a candidate solution for e^z = ln(z).
OmnipotentEntity wrote:
The answer to the problem posed in the question is known, and is a large irreducible fraction
Technically, the problem isn't asking for a number. It's asking "Is the teacher correct?". Which is far easier to answer and justify. This site answers the question, and also gives probabilities (up to n=17), for which the student coming forward would have heads; I see you also posted here going up to n=20. As far as calculating the exact probability, I don't have a nice way to figure that out. But to answer the original question, "Is the teacher correct?": No, the student is more likely to have flipped tails. Given that at least one student has neighbors that both flip heads, and a random candidate stepped forward, that student is more likely to have stepped forward the fewer candidates there are, which is more likely when that student has tails. It's related to similar Bayesian paradoxes like the Monty Hall problem and the Boy or Girl paradox.