Compilers Languages
Latest revision as of 01:49, 25 April 2026
How to read this page: This article maps the topic from beginner to expert across six levels: Remembering, Understanding, Applying, Analyzing, Evaluating, and Creating. Scan the headings to see the full scope, then read from wherever your knowledge starts to feel uncertain.
Compilers and Programming Languages are the tools that allow humans to speak to computers. While computers only understand binary (0s and 1s), humans use "High-Level" languages like Python, Java, or C++ that are readable and expressive. A compiler is a complex piece of software that translates this human-friendly code into machine-readable instructions. This field involves the study of formal grammars, lexical analysis, optimization, and the "virtual machines" that run our code. It is the bridge between human logic and hardware execution.
Remembering
- Programming Language — A formal language comprising a set of instructions that produce various kinds of output.
- Compiler — A program that translates code from a high-level language to a lower-level language (like machine code) all at once.
- Interpreter — A program that translates and executes code line-by-line (e.g., Python, JavaScript).
- Source Code — The human-readable version of a program.
- Machine Code — The binary instructions executed directly by the CPU.
- Assembly Language — A low-level language that is a human-readable version of machine code.
- Syntax — The set of rules that defines the combinations of symbols that are considered to be a correctly structured document or fragment in that language.
- Lexical Analysis (Lexing) — The first stage of a compiler; breaking code into "tokens" (like keywords, variables, and operators).
- Parsing — The second stage; organizing tokens into a "Syntax Tree" to check if the grammar is correct.
- Optimization — The process where the compiler modifies the code to make it run faster or use less memory.
- Type System — A set of rules that assigns a "type" (like integer or string) to variables to prevent errors.
- Garbage Collection — An automatic memory management system that finds and deletes objects that are no longer being used.
- Transpiler — A compiler that translates from one high-level language to another (e.g., TypeScript to JavaScript).
Understanding
The journey from source code to execution happens in a "Compiler Pipeline."
1. Front End (Lexing & Parsing): The compiler reads your code and turns it into an Abstract Syntax Tree (AST). This is a map of the logic. If you missed a semicolon, the parser will fail here.
2. Middle End (Optimization): The compiler looks for ways to make the code better. For example, if you wrote x = 5 + 5, the compiler will just change it to x = 10 (Constant Folding) so the computer doesn't have to do the math every time the program runs.
3. Back End (Code Generation): The final stage turns the optimized logic into the specific binary code for the user's CPU (Intel, ARM, etc.).
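The front-end and middle-end stages described above can be tried out with Python's standard `ast` module. The following is a minimal sketch, not how any production compiler is built: the names `ConstantFolder`, `fold_constants`, and `FOLDABLE` are made up for this illustration, only three operators are handled, and `ast.unparse` assumes Python 3.9 or later.

```python
import ast
import operator

# Operators this sketch knows how to fold (an illustrative subset).
FOLDABLE = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}

class ConstantFolder(ast.NodeTransformer):
    """Replace arithmetic on two numeric literals with the computed literal."""

    def visit_BinOp(self, node):
        # Fold the children first so nested expressions collapse fully.
        self.generic_visit(node)
        op = FOLDABLE.get(type(node.op))
        if (
            op is not None
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, (int, float))
            and isinstance(node.right.value, (int, float))
        ):
            folded = ast.Constant(op(node.left.value, node.right.value))
            return ast.copy_location(folded, node)
        return node

def fold_constants(source):
    """Parse the source, fold constants, and return the result as source text."""
    tree = ast.parse(source)                # front end: source text -> AST
    tree = ConstantFolder().visit(tree)     # middle end: optimize the AST
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)                # back to text so the change is visible

print(fold_constants("x = 5 + 5"))          # x = 10
print(fold_constants("y = (2 * 3) + n"))    # y = 6 + n
```

The second call shows that folding only happens where both operands are known at compile time: `2 * 3` becomes `6`, while `n` has to wait until the program runs.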
Compiled vs. Interpreted:
- Compiled (C++, Go, Rust): The translation happens once, before the program runs. The user gets a "binary" that is very fast but must be re-compiled for each operating system and CPU architecture.
- Interpreted (Python, Ruby): The translation happens while the program is running. It's usually slower, but the same code can run on any computer with the "interpreter" installed.
- JIT (Just-In-Time) (Java, C#): A hybrid approach where the code is compiled as it runs, combining the flexibility of interpreters with the speed of compilers.
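In practice these categories overlap. As a small illustration, the standard CPython implementation first compiles source code to bytecode and then interprets that bytecode in a virtual machine. The sketch below uses the built-in `compile` and `exec` functions and the standard `dis` module; it is specific to CPython, and the variable names are chosen only for this example.

```python
import dis

source = "result = (a + b) * 2"

# compile() runs CPython's front end and returns a code object holding bytecode.
code_obj = compile(source, "<example>", "exec")

# The bytecode is an instruction stream for CPython's virtual machine, not for the CPU.
instruction_names = [ins.opname for ins in dis.get_instructions(code_obj)]
print(instruction_names)    # exact opcodes differ between Python versions

# exec() hands the code object to the virtual machine, which interprets it.
namespace = {"a": 3, "b": 4}
exec(code_obj, namespace)
print(namespace["result"])  # 14
```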
Applying
Modeling a Simple 'Tokenizer' (Lexer):
<syntaxhighlight lang="python">
import re

def tokenize(code):
    """
    A toy lexer that identifies keywords, numbers, operators, and identifiers.
    Characters that match no pattern are silently skipped.
    """
    tokens = []
    # Token patterns, tried in order; KEYWORD must come before IDENTIFIER.
    # The \b word boundaries stop 'if' from matching the start of 'iffy'.
    patterns = [
        ('KEYWORD', r'\b(?:if|else|while|print)\b'),
        ('NUMBER', r'\d+'),
        ('OPERATOR', r'[+\-*/=]'),
        ('IDENTIFIER', r'[a-zA-Z_]\w*'),
        ('SPACE', r'\s+'),
    ]
    # Combine into one master regex with a named group per token kind
    master_re = '|'.join(f'(?P<{name}>{pattern})' for name, pattern in patterns)
    for match in re.finditer(master_re, code):
        kind = match.lastgroup
        value = match.group()
        if kind != 'SPACE':
            tokens.append((kind, value))
    return tokens

# 'Tokenizing' a simple line of code
code_line = "if x = 10"
print(tokenize(code_line))
# [('KEYWORD', 'if'), ('IDENTIFIER', 'x'), ('OPERATOR', '='), ('NUMBER', '10')]
# This is the very first step every compiler takes.
</syntaxhighlight>
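Once code has been split into tokens, the parsing stage arranges them into a tree. The following is a minimal sketch of a recursive-descent parser for arithmetic expressions. The grammar, the `Parser` class, the `lex` helper, and the tuple-based tree shape are all assumptions made for this illustration; real parsers also track source positions and report much richer errors.

```python
import re

def lex(text):
    """Split an arithmetic expression into number, name, and symbol tokens."""
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", text)

class Parser:
    """Recursive-descent parser for:
        expr   := term (('+' | '-') term)*
        term   := factor (('*' | '/') factor)*
        factor := NUMBER | NAME | '(' expr ')'
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse(self):
        tree = self.expr()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token.isdigit():
            return ("num", int(token))
        if token[0].isalpha() or token[0] == "_":
            return ("name", token)
        raise SyntaxError(f"unexpected token {token!r}")

print(Parser(lex("1 + 2 * x")).parse())
# ('+', ('num', 1), ('*', ('num', 2), ('name', 'x')))
```

Because `term` is called from inside `expr`, multiplication binds more tightly than addition without any explicit precedence table. That is why `2 * x` ends up as a subtree of the `+` node.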
Language Paradigms:
- Imperative (C, Java) → Telling the computer how to do something (step-by-step instructions).
- Declarative / Functional (Haskell, SQL) → Telling the computer what you want (e.g., "Give me all users over 20").
- Object-Oriented (Python, Smalltalk) → Organizing code around "objects" (data + behavior).
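The three paradigms can be compared on one small task, reusing the "users over 20" example from the list above. This is only an illustrative sketch: the sample data and the `User` class are invented for the comparison, and Python is used because it supports all three styles.

```python
users = [
    {"name": "Ada", "age": 36},
    {"name": "Linus", "age": 19},
    {"name": "Grace", "age": 45},
]

# Imperative: spell out how to build the result, one step at a time.
over_20_imperative = []
for user in users:
    if user["age"] > 20:
        over_20_imperative.append(user["name"])

# Declarative / functional: describe what the result is.
over_20_declarative = [user["name"] for user in users if user["age"] > 20]

# Object-oriented: bundle the data with the behavior that acts on it.
class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def is_over(self, limit):
        return self.age > limit

over_20_oo = [u.name for u in (User(**row) for row in users) if u.is_over(20)]

print(over_20_imperative)   # ['Ada', 'Grace']
print(over_20_declarative)  # ['Ada', 'Grace']
print(over_20_oo)           # ['Ada', 'Grace']
```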
Analyzing
Statically vs. Dynamically Typed:
| Feature | Static (C++, Rust) | Dynamic (Python, JS) |
|---|---|---|
| Error Checking | Type errors caught before running (compile time) | Type errors surface while running (runtime) |
| Variable Types | Fixed at compile time, declared or inferred (e.g., 'int x') | Can change as the program runs (e.g., 'x = 5') |
| Speed | Usually faster (fewer runtime type checks) | Usually slower (types are checked as it runs) |
| Development Speed | Often slower at first (more rigid) | Often faster at first (more flexible) |
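The runtime column of the table can be seen directly in Python. In the sketch below, the functions `add_tax` and `checked_total` are hypothetical names for this example. The type annotations are hints only: CPython does not enforce them when the program runs, although an external static checker such as mypy would be expected to flag a direct call like `add_tax("19.99")` before execution.

```python
def add_tax(price: float) -> float:
    """The annotations document intent; CPython does not enforce them at runtime."""
    return price * 1.2

def checked_total(raw_price):
    # Dynamic typing: a wrong type only surfaces when this call actually executes.
    try:
        return add_tax(raw_price)
    except TypeError as err:
        return f"runtime error: {err}"

print(checked_total(10.0))      # a float close to 12.0
print(checked_total("19.99"))   # a string starting with "runtime error:"
```

In a statically typed language the second call would normally be rejected at compile time, so the faulty line could never ship inside a running program.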
The Halting Problem: Alan Turing proved that it is mathematically impossible to write a program that can look at any other program and tell you if it will eventually finish or run forever. This fundamental limit means that compilers can never be "perfect" at predicting every possible outcome of a program.
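The core of Turing's argument can be sketched in a few lines. Suppose someone hands us a function `halts(program)` that claims to predict whether a program finishes. We can always build a program that asks the predictor about itself and then does the opposite. The names `make_contrarian` and `pessimist` are invented for this illustration, and the code shows the idea rather than a formal proof.

```python
def make_contrarian(halts):
    """Build a program that does the opposite of whatever `halts` predicts for it."""
    def contrarian():
        if halts(contrarian):
            while True:      # predicted to halt, so loop forever
                pass
        return "halted"      # predicted to loop, so halt immediately
    return contrarian

# A candidate predictor that claims no program ever halts.
def pessimist(program):
    return False

program = make_contrarian(pessimist)
print(pessimist(program))   # False, i.e. "this program runs forever"
print(program())            # halted, so the prediction was wrong
```

A predictor that answered `True` instead would be wrong in the other direction, because `contrarian` would then loop forever (which is why that case is not executed here). Whatever the predictor answers, some program contradicts it.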
Evaluating
Evaluating a language/compiler:
- Safety: Does the language prevent memory errors such as "Buffer Overflows" or use-after-free bugs (Rust is a prominent example, enforcing this at compile time without a garbage collector)?
- Expressiveness: How much code do you have to write to achieve a task (Python vs. Java)?
- Performance: How close to the "metal" (raw hardware speed) does the compiler get?
- Ecosystem: Are there enough libraries and tools already built for this language?
Creating
Future Frontiers:
- WebAssembly (WASM): A binary format that allows high-performance languages (like C++ or Rust) to run at near-native speed in a web browser.
- Domain Specific Languages (DSLs): Creating small, specialized languages for specific tasks (like SQL for database queries, regular expressions for text matching, or CSS for styling web pages).
- AI-Enhanced Compilers: Using deep learning to find even better optimizations that human engineers haven't thought of.
- Formal Verification: Writing compilers that come with a mathematical proof that the generated machine code behaves exactly as the source program specifies, as the CompCert C compiler does (critical for aerospace and nuclear systems).
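To make the Domain Specific Language idea from the list above concrete, here is a minimal sketch of a one-line filter language with rules such as `age > 20`. The rule syntax, the `compile_filter` function, and the `OPS` table are all invented for this example. A real DSL would need a proper lexer and parser and far better error handling.

```python
import operator

# Comparison operators supported by this toy language (an illustrative subset).
OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq}

def compile_filter(rule):
    """Turn a rule such as 'age > 20' into a Python predicate function."""
    field, op_symbol, raw_value = rule.split()
    compare = OPS[op_symbol]
    value = int(raw_value)
    return lambda record: compare(record[field], value)

users = [{"name": "Ada", "age": 36}, {"name": "Linus", "age": 19}]
over_20 = compile_filter("age > 20")
print([u["name"] for u in users if over_20(u)])   # ['Ada']
```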