Skip to main content

Number Theory Basics

Number theory studies integers and their divisibility structure. In discrete mathematics it supplies proof examples, efficient algorithms, and the arithmetic foundation for cryptography. The central objects are divisibility, primes, greatest common divisors, and integer representations.

The topic is both ancient and computational. Divisibility statements train proof writing; Euclid's algorithm is one of the oldest efficient algorithms; prime factorization leads directly to modern public-key cryptography. The habit to build here is to move between equations, divisibility notation, and algorithms without losing track of hypotheses such as "nonzero" or "positive."

Definitions

For integers aa and bb with a0a\ne0, aa divides bb, written aba\mid b, if there is an integer kk such that b=akb=ak. If no such integer exists, write aba\nmid b. Divisibility is a statement about exact integer multiples, not approximate division.

The division algorithm says that for every integer aa and positive integer dd, there are unique integers qq and rr such that

a=dq+r,0r<d.a=dq+r,\qquad 0\le r<d.

Here qq is the quotient and rr is the remainder. This theorem is the basis for modular arithmetic and for Euclid's algorithm.

An integer p>1p\gt 1 is prime if its only positive divisors are 11 and pp. An integer greater than 11 that is not prime is composite. The number 11 is neither prime nor composite.

The greatest common divisor of integers aa and bb, not both zero, is the largest positive integer dividing both. It is written gcd(a,b)\gcd(a,b). The least common multiple of nonzero integers aa and bb is the smallest positive integer divisible by both, written lcm(a,b)\operatorname{lcm}(a,b).

Integers aa and bb are relatively prime if gcd(a,b)=1\gcd(a,b)=1. A base-bb expansion of a positive integer nn is an expression

n=akbk+ak1bk1++a1b+a0,n=a_kb^k+a_{k-1}b^{k-1}+\cdots+a_1b+a_0,

where each digit satisfies 0ai<b0\le a_i\lt b and ak0a_k\ne0.

Key results

Basic divisibility facts:

abac    a(mb+nc)a\mid b \land a\mid c \implies a\mid (mb+nc)

for all integers m,nm,n. Proof: if b=asb=as and c=atc=at, then mb+nc=a(ms+nt)mb+nc=a(ms+nt), and ms+ntms+nt is an integer.

The fundamental theorem of arithmetic states that every integer greater than 11 can be written uniquely as a product of primes, up to the order of the factors. This is why prime factorization can be used to compute gcds, lcms, and divisibility.

If nn is composite, then nn has a prime divisor at most n\sqrt n. Proof: write n=abn=ab with 1<a<n1\lt a\lt n and 1<b<n1\lt b\lt n. If both aa and bb were greater than n\sqrt n, then ab>nab\gt n, impossible. So one factor is at most n\sqrt n, and that factor has a prime divisor at most itself.

Euclid's algorithm rests on

gcd(a,b)=gcd(b,amodb).\gcd(a,b)=\gcd(b,a\bmod b).

If a=bq+ra=bq+r, then a common divisor of aa and bb also divides r=abqr=a-bq, and a common divisor of bb and rr also divides a=bq+ra=bq+r. Thus the common divisors are the same.

There are infinitely many primes. Suppose there were only finitely many, p1,,pkp_1,\dots,p_k. Let

N=p1p2pk+1.N=p_1p_2\cdots p_k+1.

No listed prime divides NN, because division by any pip_i leaves remainder 11. But N>1N\gt 1, so by prime factorization it has a prime divisor. That prime is not on the list, contradiction.

For positive integers a,ba,b,

ab=gcd(a,b)lcm(a,b).ab=\gcd(a,b)\operatorname{lcm}(a,b).

This follows by comparing prime exponents: the gcd takes minimum exponents, the lcm takes maximum exponents, and the sum of the minimum and maximum is the sum of the two original exponents.

Visual

TaskMain toolKey bound or fact
test divisibilitydefinition b=akb=akquotient must be integer
find quotient and remainderdivision algorithm0r<d0\le r\lt d
test primality by trial divisionprime divisor boundtest primes n\le\sqrt n
compute gcdEuclidean algorithmrepeated remainders decrease
compute lcmgcd-lcm identitylcm(a,b)=ab/gcd(a,b)\operatorname{lcm}(a,b)=ab/\gcd(a,b) for positive a,ba,b
convert baserepeated divisionremainders become digits

Worked example 1: Use the Euclidean algorithm and back-substitution

Problem. Compute gcd(252,198)\gcd(252,198) and express it as an integer combination of 252252 and 198198.

Method.

  1. Apply the Euclidean algorithm:
252=1981+54,198=543+36,54=361+18,36=182+0.\begin{aligned} 252 &= 198\cdot1+54,\\ 198 &= 54\cdot3+36,\\ 54 &= 36\cdot1+18,\\ 36 &= 18\cdot2+0. \end{aligned}
  1. The last nonzero remainder is 1818, so gcd(252,198)=18\gcd(252,198)=18.
  2. Back-substitute:
18=5436.18=54-36.
  1. Substitute 36=19854336=198-54\cdot3:
18=54(198543)=454198.\begin{aligned} 18 &=54-(198-54\cdot3)\\ &=4\cdot54-198. \end{aligned}
  1. Substitute 54=25219854=252-198:
18=4(252198)198=42525198.\begin{aligned} 18 &=4(252-198)-198\\ &=4\cdot252-5\cdot198. \end{aligned}

Checked answer. gcd(252,198)=18\gcd(252,198)=18, and 18=4252519818=4\cdot252-5\cdot198. The combination confirms that any common divisor of 252252 and 198198 must divide 1818.

Worked example 2: Test primality and factor a number

Problem. Determine whether 221221 is prime. If it is composite, factor it.

Method.

  1. Compute the square-root cutoff:
22114.86.\sqrt{221}\approx14.86.
  1. Test primes at most 14.8614.86: 2,3,5,7,11,132,3,5,7,11,13.
  2. 221221 is odd, so 22212\nmid221.
  3. Sum of digits is 2+2+1=52+2+1=5, so 32213\nmid221.
  4. It does not end in 00 or 55, so 52215\nmid221.
  5. Check 77: 731=2177\cdot31=217, so 72217\nmid221.
  6. Check 1111: 1120=22011\cdot20=220, so 1122111\nmid221.
  7. Check 1313: 1317=22113\cdot17=221.

Checked answer. 221221 is composite, and 221=1317221=13\cdot17. The search could stop as soon as the factor 1313 was found. If no prime up to 221\sqrt{221} had divided it, then 221221 would have been prime.

Code

def gcd(a, b):
a, b = abs(a), abs(b)
while b:
a, b = b, a % b
return a

def factor(n):
result = []
d = 2
while d * d <= n:
while n % d == 0:
result.append(d)
n //= d
d += 1 if d == 2 else 2
if n > 1:
result.append(n)
return result

def to_base(n, base):
if n == 0:
return [0]
digits = []
while n:
n, r = divmod(n, base)
digits.append(r)
return digits[::-1]

print(gcd(252, 198))
print(factor(221))
print(to_base(241, 2))

The factor function is trial division. It is suitable for small examples and for learning the logic, but large-scale factoring requires much more sophisticated algorithms.

Common pitfalls

  • Calling 11 prime. It is neither prime nor composite; otherwise unique factorization would fail.
  • Forgetting the condition a0a\ne0 in the definition of aba\mid b.
  • Using the division algorithm with a negative divisor. The standard statement takes the divisor dd to be positive.
  • Stopping trial division too late or too early. Testing primes up to n\sqrt n is enough; testing only to n/2n/2 is unnecessary.
  • Confusing gcd with lcm in prime factorizations. Gcd uses minimum exponents; lcm uses maximum exponents.
  • Treating Euclid's algorithm as a black box without preserving the back-substitution equations needed for Bezout coefficients.

Divisibility proofs should always return to the definition. To prove aba\mid b, exhibit an integer kk with b=akb=ak. To disprove it, show no such integer can exist, often by using remainders, parity, or bounds. Manipulating the vertical bar as if it were an arithmetic fraction is a common source of invalid reasoning.

The division algorithm is also the reason remainders are well-defined. For a positive divisor dd, every integer falls into exactly one of the dd remainder classes 0,1,,d10,1,\dots,d-1. Negative dividends still have nonnegative remainders in the standard convention: for example, 11=3(4)+1-11=3(-4)+1, so the remainder on division by 33 is 11.

Prime factorization gives a second way to compute gcd and lcm. If

a=piαi,b=piβi,a=\prod p_i^{\alpha_i},\qquad b=\prod p_i^{\beta_i},

then

gcd(a,b)=pimin(αi,βi),lcm(a,b)=pimax(αi,βi).\gcd(a,b)=\prod p_i^{\min(\alpha_i,\beta_i)},\qquad \operatorname{lcm}(a,b)=\prod p_i^{\max(\alpha_i,\beta_i)}.

This method is conceptually clear but requires factoring, which may be expensive for large integers. Euclid's algorithm avoids full factorization and is much more efficient for gcd computation.

Base expansion is another application of repeated division. Dividing by the base gives the least significant digit as a remainder; repeating on the quotient gives the next digit. The digits appear in reverse order of discovery, which is why base-conversion algorithms usually collect remainders and then reverse the list.

When doing Euclidean algorithm work by hand, keep the quotient-remainder equations aligned and do not overwrite them. The same equations used to find the gcd are needed for back-substitution. Losing one equation often forces the computation to be repeated from the beginning.

For primality testing, test divisibility only by primes up to the square root, not every integer. If a composite number has a factor larger than n\sqrt n, the matching factor is smaller than n\sqrt n. This is the reason trial division can stop at the square-root boundary. It is also the reason finding no small prime divisor proves primality.

When using prime factorizations, keep the same prime list for both numbers by allowing exponent 00. For example, 12=22315012=2^2 3^1 5^0 and 20=22305120=2^2 3^0 5^1. Then minimum and maximum exponents can be read componentwise for gcd and lcm without forgetting primes that appear in only one number.

Connections