N: An Experimental Multithreaded bignum Library for Windows

Michael Chourdakis

4.85/5 (20 votes)

Sep 6, 2022

CPOL

3 min read

18279

264

This class allows math between any size big number, as long as your available memory can handle it.

Download source code - 5.3 KB

Introduction

Yes this is not optimal, windows only, not fully complete. Why bother?

Dunno, just because both you and me are interested in low level stuff. Keep reading!

Background

Numbers are nice, but they can only be as large as 2^64 in a form of unsigned long long. Sometimes, you need math between much larger numbers (for example, in cryptography). This class allows math between any size big number, as long as your available memory can handle it.

Multicore Library

For multicore operations, the library uses my own tpoollib, which uses the new Vista+ API for threadpools.

The N Class Internals

N stores all its digits in a vector<D> (D being a typedef for int at the moment), as a decimal array, a base value (default 10) to know what base system the digits are, and a bool flag to keep the sign value. Construction is done via either an integer, or a string:

N n(5LL);
N n(-12438LL);
N n("384747274747282882818373"); // If a non digit is encountered, parsing stops
N n("-274727372737");
N n("111",2); // n = 7

N has also a member function ParseBase:

N n;
n.ParseBase("FFFF",16);  // n = 65535

N Representation and Access

In the debug version of the library, after each operation that changes N, a special function is called to aid you in debugging.

N::operator[](size_t) gives the specific digit from the right, for example if n = 12345 then n[0] is 5. If the specified digit exceeds the number, the function does not crash, but returns 0 instead, allowing itself to be used when adding or multiplying.

Function s(unsigned long long base = b); returns a string representation of N, in the specified base system.

Function tobase(unsigned long long nb) returns a new N which stores the same value in a different base system.

Function upperpart() and lowerpart() return a subpart of the number:

    N a("12345");
    auto a2 = a.upperpart(2); // a2 = 12
    auto a3 = a.lowerpart(3); // a3 = 345
    a.Set(16LL);
    auto str = a.s();      // str = "16"
    str = a.tobase(2).s(); // str = "10000"
    str = a.tobase(16).s(); // str = "10"
    str = a.tobase(10).s(); / str = "16"

Single Add

Adding two same-signed Ns (via operators or function calls or subtracting different-sign Ns) eventually calls:

static N w_add(const N& n1, const N& n2);

This function loops through all digits of the numbers and adds them with carry:

        static N w_add(const N& n1, const N& n2)
            {
            if (n1.b != n2.b)
                return w_add(n1, n2.tobase(n1.b));
            if (n1.m != n2.m)
                {
                if (n1.m)
                    return w_subx(n2, n1.negative());
                return w_subx(n1, n2.negative());
                }
            if (n1.n.empty()) return n2;
            if (n2.n.empty()) return n1;

            N n;
            n.ChangeInternalBase(n1.b);
            n.n.reserve(max(n1.NumDigits(), n2.NumDigits()));
            D carry = 0;

            if (n1.m && n2.m)
                n.m = true;

            size_t j = 0;
            for (size_t i = 0; i < n1.NumDigits() || i < n2.NumDigits() ; i++)
                {
                j = i;
                D sum = n1[i] + n2[i] + carry;
                carry = 0;
                if (sum >= n1.b)
                    {
                    carry = 1;
                    sum -= n1.b;
                    }
                n.n.push_back(sum);
                }
            n.n.push_back(carry);
            n.RemoveZeroes();
            }

Single Sub

Subtraction eventually calls function w_subx():

        static N w_subx(const N& n1, const N& n2)
            {
            if (n1.m != n2.m)
                return w_add(n1, n2.negative());
                
            if (n1.absolute() < n2.absolute())
                return w_subx(n2, n1).negative();

            if (n2.IsZero())
                return n1;
            if (n1.IsZero())
                return n2.negative();

            N n;
            n.ChangeInternalBase(n1.b);
            n.m = n1.m;
            int carry = 0;

            for (size_t i = 0; i < n1.NumDigits() || i < n2.NumDigits(); i++)
                {
                signed long long sum = n1[i] - n2[i] + carry;
                carry = 0;
                if (sum < 0)
                    {
                    sum = n1.b + sum;
                    carry = -1;
                    }
                n.n.push_back(sum);
                }
            n.n.push_back(carry);
            n.RemoveZeroes();
            return n;

            }

Logical Operations

w_logical() is called for operations like AND, OR and XOR. It always converts the numbers to binary. Note that operator ^ by default executes w_pow(), unless both operands are in binary format.

        static N w_logical(const N& n1, const N& n2,int x)
            {
            if (n1.b != 2)
                return w_logical(n1.tobase(2),n2,x);
            if (n2.b != 2)
                return w_logical(n1, n2.tobase(2),x);

            N n;
            n.ChangeInternalBase(2);
            n.n.reserve(max(n1.NumDigits(), n2.NumDigits()));

            for (size_t i = 0; i < n1.NumDigits() || i < n2.NumDigits(); i++)
                {
                D sum = 0;
                if (x == 0) sum = n1[i] & n2[i];
                if (x == 1) sum = n1[i] | n2[i];
                if (x == 2) sum = n1[i] ^ n2[i];
                n.n.push_back(sum);
                }
            n.RemoveZeroes();
            return n;
            }

    N a("10",2);
    N b("11",2);
    auto r = a | b; // r = 5

In addition, cshl(), cshr(), shl2(), shl2(), negative() and absolute() perform the relevant logical operations. shl() and shr() perform self-shl/shr, where the other functions generate a new N.

Multiplication

The multiplication function creates a vector<N> with all multiplications, then adds them together:

        static N w_mul(const N& n1, const N& n2)
            {
            if (n1.b != n2.b)
                return w_mul(n1, n2.tobase(n1.b));
            N n;
            n.ChangeInternalBase(n1.b);
            vector<N> addiz;
            for (size_t i = 0; i < n1.n.size(); i++)
                {
                D d1 = n1.n[i];
                N addi;
                addi.n.reserve(i + n2.n.size());
                for (size_t j = 0; j < i; j++)
                    addi.n.push_back(0);
                D carry = 0;
                for (size_t y = 0; y < n2.n.size(); y++)
                    {
                    D d2 = n2.n[y];
                    D dm = (d1*d2) + carry;
                    carry = 0;
                    carry = dm / n1.b;
                    dm %= n1.b;
                    addi.n.push_back(dm);
                    }
                addi.n.push_back(carry);
                addi.RemoveZeroes();
                addiz.push_back(addi);
                }
            for (auto& a : addiz)
                n += a;
            if (n1.m != n2.m)
                n.m = true;
            return n;
            }

Division

The division function (used by division stuff and operators like /,%,/=,%= etc) returns a tuple of the result and the remainder:

        static tuple<N, N> w_div(const N& n1, const N& n2,bool NoChangeBase = false)
            {
            if (n1.b != n2.b && NoChangeBase == false)
                return w_div(n1.b, n2.tobase(n1.b));
            if (n2 > n1)
                {
                N res = n1;
                return std::make_tuple<N, N>(0LL, std::forward<N>(res));
                }
            if (n2 == n1)
                return std::make_tuple<N, N>(1LL, 0LL);

            N rem = n1;
            N res;
            res.ChangeInternalBase(n1.b);

            for (;;)
                {
                auto nd2 = n2.NumDigits();
                auto upper = rem.upperpart(nd2);
                if (upper < n2)
                    {
                    nd2++;
                    upper = rem.upperpart(nd2);
                    if (upper < n2)
                        {
                        // End...
                        return std::make_tuple<N, N>(forward<N>(res), forward<N>(rem));
                        }
                    }

                // This needs optimization
                unsigned long long js = 9;
                N m1;
                for (; js >= 1; js--)
                    {
                    m1 = w_mul(n2, js);
                    if (m1 < upper)
                        break;
                    }             

                res.n.insert(res.n.begin(),js);
                upper -= m1;
                upper.shl(rem.NumDigits() - nd2);
                upper += rem.lowerpart(rem.NumDigits() - nd2);
                rem = upper;
                }
            }

Power

Nothing much complicated.

        static N w_pow(const N& n1, const N& n2)
            {
            N z = n1;
            if (n2 == 0ll)
                return 1ll;
            if (n2 == 1ll)
                return n1;
            if (n1 == 1ll)
                return 1ll;

            for (N j = 1ll; j < n2; ++j)
                z *= n1;
            return z;
            }

    N a("-5");
    auto a2 = N::w_pow(a, 3LL); // a2 = -125

Multicore Operations

A multiple core adding accepts a vector of Ns, and adds them in pairs, then calls itself with the result until there are only 2 numbers to be added:

        static N w_add2(vector<N>& n, tpoollib::tpool<>& pool)
            {
            if (n.size() == 0)
                return 0LL;
            if (n.size() == 1)
                return n[0];
            if (n.size() == 2)
                {
                N res = n[0];
                res += n[1];
                return res;
                }
            struct Z
                {
                N* n1;
                N* n2;
                N* res;
                };
            vector<Z> z(n.size());
            vector<N> res(n.size() / 2);

            for (size_t i = 0; i < n.size(); i += 2)
                {
                if (i == (n.size() - 1))
                    break; // odd number of additions

                auto a = [](PTP_CALLBACK_INSTANCE, PVOID j, PTP_WORK)
                    {
                    Z* z = (Z*)j;
                    *z->res = w_add(*z->n1,*z->n2);
                    };
                Z& zz = z[i];
                zz.n1 = &n[i];
                zz.n2 = &n[i + 1];
                zz.res = &res[i / 2];

                auto wo = pool.CreateItem<PTP_WORK, PTP_WORK_CALLBACK>(a, (PVOID)&zz);
                pool.RunItem(wo);
                }
            pool.Join();
            if (n.size() % 2)
                res.push_back(n[n.size() - 1]);
            return w_add2(res,pool);
            }

It uses a temporary struct Z to pass it to the threadpool function.

A multiple core multiplication creates a vector of multiplications, then adds them with w_add2():

        static N w_mul2(const N& n1, const N& n2, tpoollib::tpool<>& pool)
            {
            size_t muls = n1.NumDigits() * n2.NumDigits();
            vector<N> a;
            a.reserve(muls);
            for (size_t i = 0; i < n1.NumDigits(); i++)
                {
                for (size_t ii = 0; ii < n2.NumDigits(); ii++)
                    {
                    N rr;
                    D d1 = n1[i];
                    D d2 = n2[ii];
                    unsigned long long r = d1 * d2;
                    rr = r;
                    rr.shl(ii + i);
                    a.push_back(rr);
                    }
                }
            return w_add2(a,pool);
            }

This can take optimizations, for example to find the sums within a multicore for loop itself.

A multiple core power just uses w_mul2():

        static N w_pow2(const N& n1, const N& n2,tpoollib::tpool<>& pool)
            {
            N z = n1;
            if (n2 == 0ll)
                return 1ll;
            if (n2 == 1ll)
                return n1;
            if (n1 == 1ll)
                return 1ll;

            for (N j = 1ll; j < n2; ++j)
                z = w_mul2(z, n1,pool);
            return z;
            }

This can take optimizations as well.

The Code

All included in one header file, feel free to enhance!

History

11^th August, 2016: First release