Learning Rust Language Topics
types
- only u8 ints can be cast to chars (with as)
- .len() returns the number of bytes, not the number of characters
- .chars().count() returns the number of chars in a string
- add _ to numbers for readability, they are ignored, e.g. 1_000_000
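A small sketch of the points above. The helper name byte_and_char_len is made up for this note, it is not a standard function:

```rust
// Hypothetical helper: returns (byte length, character count) of a string.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // \u{e9} is 'é', which takes 2 bytes in UTF-8,
    // so the byte length is larger than the char count.
    let (bytes, chars) = byte_and_char_len("caf\u{e9}");
    println!("bytes: {}, chars: {}", bytes, chars);

    // Only u8 can be cast to char with `as`.
    let letter = 97u8 as char;
    println!("{}", letter);

    // Underscores in number literals are ignored by the compiler.
    let million = 1_000_000;
    println!("{}", million);
}
```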
functions
- the skinny arrow -> indicates the return type
- if you specify the types of the arguments in a function, rust infers the types of the variables that you pass to the function
- a line without ; is an expression whose value gets returned
- the last expression of a function is its return value
blocks
- capture variables in format strings with curlies “var: {var}”
- a variable's lifetime ends at the end of the block it was declared in
logging
- debug print data structures other than strings with {:?}
- multiline pretty print with {:#?}
variables
- access the associated constants of a type with ::, e.g. u8::MAX
- make variables mutable with mut
- you can’t change the type of a mut variable
- when shadowing a variable inside a block, the original value is visible again after the block scope is closed
- variables can be declared uninitialised, assigned inside an inner scope, and still be used after that scope ends
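A minimal sketch of shadowing and late initialisation. shadow_demo is a made-up name for illustration:

```rust
// Hypothetical helper showing shadowing inside a block and a variable
// that is declared first and assigned later.
fn shadow_demo() -> (i32, i32) {
    let x = 5;
    let inner; // declared uninitialised
    {
        let x = x * 2; // shadows the outer x, only inside this block
        inner = x; // assigned inside the inner scope
    }
    // The outer x is untouched, and `inner` is still alive here.
    (x, inner)
}

fn main() {
    let (outer, inner) = shadow_demo();
    println!("outer: {}, inner: {}", outer, inner);
}
```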
memory
- memory on stack needs fixed sizes
- memory on heap can have unknown data length, but it’s slower
- sometimes heap is not even available (embedded devices)
references
- rust reference is a “memory safe” pointer: a reference to “owned” memory
- references point to memory of other value: “borrowing” values
- references are prefixed with &
- let my_reference = &my_value
- mem::size_of::<u8>() shows the size of a type
- mem::size_of_val(&reference) shows the size of a value
- functions cannot return references to values that “die” in the function scope
- * dereferences a reference. if it is a mutable reference you can use * to operate on the value
ownership
- you can pass objects that you own around
- you can pass references to data
- then the function can view the data but not mutate it
- you can have unlimited immutable references to a value (read only)
- you can have only one mutable reference to a value (write), and no immutable ones at the same time
moving / borrowing
- when passing an owned object to a function (var: String), the function takes ownership and the value will die at the end of the function's scope. This is called a “move” as the data is moved to a different scope
- when passing a var: &String, the value doesn’t move or change ownership; the value can be viewed, not mutated, and will not die
- when passing a var: &mut String mutable reference, the value doesn’t change ownership, doesn’t die and is mutable.
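The three ways of passing a String can be sketched like this. The function names consume, view and append are made up for this note:

```rust
// Takes ownership: the String is dropped when this function ends.
fn consume(s: String) -> usize {
    s.len()
}

// Borrows immutably: can read, cannot mutate, caller keeps ownership.
fn view(s: &String) -> usize {
    s.len()
}

// Borrows mutably: can change the value, caller keeps ownership.
fn append(s: &mut String) {
    s.push_str(" world");
}

fn main() {
    let mut greeting = String::from("hello");
    println!("{}", view(&greeting));
    append(&mut greeting);
    println!("{}", greeting);
    let len = consume(greeting); // moved here
    // `greeting` can no longer be used after the move.
    println!("{}", len);
}
```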
strings
- a string literal results in a &str
- a &str (“ref-stir”, or string slice) is a reference with a known length to string data in memory
- a String object has its data on the heap, owns its data and has functionality to mutate the data
- the format! macro is like println! but creates a String object
floats
- use convenience functions such as .floor() .ceil() .round() .trunc()
copy / clone
- Copy types (ints, floats, bool, char) always get copied when passed to a function
- Strings don’t get copied but are clone-able with .clone()
printing
- multiline strings are allowed
- use raw strings to print text with loads of characters that would otherwise need escaping
println!(r#"string with bunch " of " escapable chars"#)
- use b to print a string as bytes
- use {:x} (or {:X} for uppercase) to print hex
- use {:p} to print a pointer address, as in the memory address the pointer points to
- use \u{hex} to print a Unicode character
- printing can also specify padding, alignment and padding character
arrays
- array types are written element type first, then length: [&String; 2]
- prefill an array with let mut buffer = [0; 640];
- slice an array with a start..end range. End exclusive: &array[0..2], end inclusive: &array[0..=2]
vectors
- a Vec is to an array what String is to &str: the growable, owned version
- Vec<String> is a vector of strings
- Vec<(i32,i32)> is a vector of tuples of 2 i32s
- all items within a vector have the same type
- vector.capacity() shows how many items fit before the data is copied to a new allocation of (typically) double the size, usually 4 when there is 1 value
- Vec::with_capacity(8) avoids reallocation when you know the future vector size, which is better
- let vector: Vec<_> = [1,2,3].into() creates a vector from an array
- use .extend to append the items of another collection to a vector:
let length: u32 = 2000;
let mut data: Vec<u8> = Vec::new();
data.extend(length.to_le_bytes());
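A small sketch that combines with_capacity and extend. The length_prefixed helper and its "4 length bytes, then the payload" layout are assumptions made up for illustration:

```rust
// Hypothetical helper: builds a buffer of a little-endian u32 length
// followed by the payload bytes.
fn length_prefixed(payload: &[u8]) -> Vec<u8> {
    let length = payload.len() as u32;
    // We know the final size, so reserve it up front to avoid reallocation.
    let mut data: Vec<u8> = Vec::with_capacity(4 + payload.len());
    data.extend(length.to_le_bytes()); // appends the 4 length bytes
    data.extend_from_slice(payload); // appends the payload
    data
}

fn main() {
    let data = length_prefixed(&[7, 8]);
    println!("{:?} (capacity {})", data, data.capacity());
}
```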
tuples
- tuples can contain multiple different types
- access tuple members with dot:
tuple.0
- destructure the items of a vector with a tuple pattern
let vector = vec!["one", "two", "three"];
let (a, _, c) = (vector[0], vector[1], vector[2]);
match
- match is like a switch but needs to be exhaustive
- code block, fat arrows and “arms” for each case
fn main() {
    let my_number: u8 = 5;
    match my_number {
        0 => println!("it's zero"),
        1 => println!("it's one"),
        2 => println!("it's two"),
        _ => println!("It's some other number"),
    }
}
- “match guards” are if statements within a match
- matches also work with tuples, allowing you to check multiple things and match them to a specific case, and catching all other cases
- match stops at the first arm that matches
- all arms of a match must return the same type
- use @ to bind the matched value to a variable, for example to use it in a print statement
number @ 13 => println!("{} is unlucky", number),
structs
- unit struct, has nothing:
struct FileDirectory;
- unnamed tuple struct, only has types:
struct Color(u8, u8, u8);
- named struct, has named fields with types:
struct Light { on: bool, brightness: u8, color: Color }
- field init shorthand lets you skip assigning the values if the variables are named the same as the fields:
let on = true;
let brightness = 50;
let color = Color(50, 50, 50);
let light = Light {
    on, brightness, color,
};
enum
- use struct if you have many different things, use enum if you have one thing with many options
enum States { Play, Pause, Stop }
- enums combine perfectly with match to name the states:
match state {
    ThingsInTheSky::Sun => println!("I can see the sun!"),
    ThingsInTheSky::Stars => println!("I can see the stars!"),
}
- enums can contain data when created:
enum ThingsInTheSky {
    Sun(String), // Now each variant has a string
    Stars(String),
}
- import enums to allow us to use the members directly:
fn match_mood(mood: &Mood) -> i32 {
    use Mood::*; // We imported everything in Mood. Now we can just write Happy, Sleepy, etc.
    let happiness_level = match mood {
        Happy => 10, // We don't have to write Mood:: anymore
        Sleepy => 6,
        NotBad => 7,
        Angry => 2,
    };
    happiness_level
}
- enum variants have indexes (discriminants), get them by casting the variant with as i32. if you set one explicitly, the numbering goes on from there
- if you want to create a vector with different types, you can use an enum whose variants each hold a different type
loops
- loop {} loops indefinitely until break; is called
- you can name loops with 'loop_outer: loop { break 'loop_outer; }
- for number in 0..2 { } to loop over a range, use _ if you don’t need the variable
- break can return a value that can be assigned from the loop
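A sketch of break-with-value and labelled loops. Both helper functions are made up for this note:

```rust
// Returns the first power of two above `limit`, using `break` with a value.
fn first_power_of_two_above(limit: u32) -> u32 {
    let mut value = 1;
    loop {
        if value > limit {
            break value; // the whole loop evaluates to this value
        }
        value *= 2;
    }
}

// Counts the (a, b) pairs visited before a * b reaches `target`,
// leaving both loops at once through the label.
fn pairs_before_product(target: u32) -> u32 {
    let mut visited = 0;
    'outer: for a in 1..10 {
        for b in 1..10 {
            if a * b >= target {
                break 'outer;
            }
            visited += 1;
        }
    }
    visited
}

fn main() {
    println!("{}", first_power_of_two_above(10));
    println!("{}", pairs_before_product(6));
}
```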
implementing structs and enums
- methods: take “self” (&self, &mut self or self) and operate on the object
- associated functions: don’t take self but are related to the type, called with Type::function()
enum Mood {
    Good,
    Bad,
    Sleepy,
}
impl Mood {
    fn check(&self) {
        match self {
            Mood::Good => println!("Feeling good!"),
            Mood::Bad => println!("Eh, not feeling so good"),
            Mood::Sleepy => println!("Need sleep NOW"),
        }
    }
}
fn main() {
    let my_mood = Mood::Sleepy;
    my_mood.check();
}
attributes
- added to structs and enums
#[derive(Debug)]
dereferencing
- when using the dot operator . you don’t have to worry about dereferencing a reference, this works:
let reference_item = &item;
let double_reference_item = &reference_item;
double_reference_item.compare_number(8);
generics
- used if multiple types should be able to be passed to a function
fn print_value<T>(value: T) { println!("{}", value) } // does not compile yet: T needs a Display bound
- a generic is usually called T
- specify which trait, for instance Debug, must be implemented on a generic:
fn print_value<T: Debug>(value: T)
- you can require that a generic implements multiple traits (all of them) by using +
fn some_func<T: SomeTrait + OtherTrait>(item: T) -> bool {
    item.some_function() && item.other_function()
}
- if you want to compare generics, you need PartialOrd
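A minimal sketch of a generic function with two trait bounds. The function name larger is made up for illustration:

```rust
use std::fmt::Debug;

// T must implement PartialOrd (for >) and Debug (for {:?}).
fn larger<T: PartialOrd + Debug>(a: T, b: T) -> T {
    println!("comparing {:?} and {:?}", a, b);
    if a > b {
        a
    } else {
        b
    }
}

fn main() {
    // The same function works for ints, floats and string slices.
    println!("{}", larger(3, 7));
    println!("{}", larger(2.5, 1.0));
    println!("{}", larger("a", "b"));
}
```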
option
- use option as return type when you deal with values that might exist, or might not exist
- use None and Some in code where you need to return an option
fn take_fifth(value: Vec<i32>) -> Option<i32> {
    if (..) {
        None
    } else {
        Some(value[4])
    }
}
- a Some value can be “unwrapped”, but unwrapping a None panics. You only want to unwrap if you are sure there is a value inside
- we can use a match on an option to safely deal with the None value
- we can use is_some() on an option to see if it contains a Some value
- we can use unwrap_or() to provide a fallback value if the value was None
- if we match an optional struct, we can use the ref keyword, Some(ref value), to get a reference to the value instead of moving the value out of the option
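A runnable sketch of the option handling above. The notes leave the condition in take_fifth open, so the length check here is an assumption, and the function takes a slice instead of a Vec to keep the example short. describe is a made-up helper:

```rust
// Assumption: "no fifth item" means the slice has fewer than 5 items.
fn take_fifth(value: &[i32]) -> Option<i32> {
    if value.len() < 5 {
        None
    } else {
        Some(value[4])
    }
}

// Hypothetical helper: handles both cases safely with match.
fn describe(value: &[i32]) -> String {
    match take_fifth(value) {
        Some(number) => format!("fifth is {}", number),
        None => String::from("too short"),
    }
}

fn main() {
    println!("{}", describe(&[1, 2, 3, 4, 5]));
    println!("{}", describe(&[1, 2]));
    // unwrap_or gives a fallback instead of panicking on None.
    println!("{}", take_fifth(&[1]).unwrap_or(0));
}
```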
result and error handling
- result is similar to option, but about Ok or Err instead of no value or value
- take extra care defining the return type of a function that returns a result.
- A common mistake is to make the result return one of the “Error” types, but you should define the type of the thing that is wrapped in the Err
fn generate_error() -> Result<String, String> {
    if 1 == 1 {
        Ok("it was 1".to_string())
    } else {
        Err("wasnt true".to_string())
    }
}
- you can define your own error types, for example:
struct PositiveNonzeroInteger(u64);
enum CreationError {
    Negative,
    Zero,
}
impl PositiveNonzeroInteger {
    fn new(value: i64) -> Result<Self, CreationError> {
        if value < 0 {
            return Err(CreationError::Negative);
        }
        if value == 0 {
            return Err(CreationError::Zero);
        }
        Ok(Self(value as u64))
    }
}
- implementing naive error handling:
fn give_result(input: i32) -> Result<(), &'static str> {
    if input % 2 == 0 {
        Ok(())
    } else {
        Err("input couldnt be processed")
    }
}
fn main() {
    if give_result(5).is_ok() {
        println!("it was ok");
    } else {
        println!("it was an error");
    }
}
- if you encounter a result, you can if let the error case and continue to safely unwrap the value:
let stream = TcpStream::connect(args.host);
if let Err(e) = stream {
    eprintln!("Failed to connect to host: {}", e);
    return Err(e);
}
let mut stream = stream.unwrap();
- or, use match and handle the error in one block
let mut stream = match TcpStream::connect(args.host) {
    Ok(stream) => stream,
    Err(e) => {
        eprintln!("Failed to connect to host: {}", e);
        return Err(e)
    }
};
if let and while let
- if you only want to do something when the value is not None:
let my_vec = vec![2, 3, 4];
for index in 0..10 {
    if let Some(number) = my_vec.get(index) {
        println!("the number is {}", number)
    }
}
- you can do something similar with while let, e.g. popping a vector until it returns none
while let Some(information) = city.pop()
hashmap
- hashmaps are not ordered
- create with HashMap::new()
- populate with hashmap.insert(key, value)
- hashmap.get(&key) returns an Option
- inserting with an existing key will overwrite the value
- the .entry(key) method returns an Entry: a handle to the slot for that key, whether or not the key exists yet
- if we chain .entry(key).or_insert(0), the key with value 0 will be inserted into the hashmap if it didn’t exist yet, and we get a mutable reference to the value back
- we can then use that reference to, for instance, increase a counter: *reference += 1;
- you can also use or_insert to create vectors with items as values
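The entry/or_insert counter pattern, sketched as a word counter. The count_words helper is made up for this note:

```rust
use std::collections::HashMap;

// Hypothetical helper: counts how often each word occurs in a text.
fn count_words(text: &str) -> HashMap<&str, u32> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // Inserts 0 if the word is new, then bumps the counter
        // through the returned mutable reference.
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = count_words("a b a");
    // Note: the printed order of a HashMap is not fixed.
    println!("{:?}", counts);
}
```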
btreemap
- btreemaps work like hashmaps but are sorted by key
- iterate with for (key, value) in &btreemap
hashset
- a hashmap where every value is ()
- used to keep tabs on whether a key exists or not
- use hashset.get(&number).is_none() (or !hashset.contains(&number)) to check whether a key is missing
BTreeSet
- same as hashset but ordered
binary heap
- is a collection where the first element is always the biggest value, the rest is in no guaranteed order
- used for instance to create prioritised lists, for example using a BinaryHeap<(u8, &str)> of tuples for tasks
VecDeque
- vector made especially for being efficient at popping elements from both the front and the back of the array.
- use pop_front() to pop items from the front of the array
- pop_back() and pop_front() return an option, options need to be .unwrap()ped
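A short sketch of BinaryHeap and VecDeque from the standard library. The helper names highest_priority and ends are made up for illustration:

```rust
use std::collections::{BinaryHeap, VecDeque};

// Hypothetical helper: returns the task with the highest priority number.
// Tuples are compared on their first element first.
fn highest_priority(tasks: Vec<(u8, &str)>) -> Option<(u8, &str)> {
    let mut heap: BinaryHeap<(u8, &str)> = tasks.into_iter().collect();
    heap.pop() // always removes the biggest element
}

// Hypothetical helper: pops one item from each end of a VecDeque.
fn ends(items: &[i32]) -> (Option<i32>, Option<i32>) {
    let mut queue: VecDeque<i32> = items.iter().copied().collect();
    (queue.pop_front(), queue.pop_back())
}

fn main() {
    let tasks = vec![(1, "sleep"), (9, "ship"), (5, "test")];
    println!("{:?}", highest_priority(tasks));
    println!("{:?}", ends(&[1, 2, 3]));
}
```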
the mighty ? operator
“a lot of people just question mark out the errors really quickly, you end up handling an error 10 functions higher, with no indication to where this error happened” (theprimagen, https://youtu.be/7ySVWcFHz98?si=33xKgzpYc4YGZAds&t=1039)
- warning: theprimagen warns against becoming lazy by using the ? operator instead of handling errors
- adding the question mark operator ? gives you the value if the result was Ok, and returns early with the error if it was an Err
- the compiler can help us find the Err types if we try to call a non-existent function on an item:
let failure = "Not a number".parse::<i32>();
failure.rbrbrb(); // ⚠️ Compiler: "What is rbrbrb()???"
error[E0599]: no method named `rbrbrb` found for enum `std::result::Result<i32, std::num::ParseIntError>` in the current scope
- the ? operator is useful when working with files, as each step of loading or writing a file can produce errors
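A minimal sketch of ? with parsing. The sum_of helper is made up, and the error type is the one the compiler message above reveals:

```rust
use std::num::ParseIntError;

// Hypothetical helper: parses two numbers and adds them.
// Each ? returns early with the ParseIntError if parsing fails.
fn sum_of(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.parse::<i32>()?;
    let y = b.parse::<i32>()?;
    Ok(x + y)
}

fn main() {
    println!("{:?}", sum_of("2", "3"));
    println!("{:?}", sum_of("2", "Not a number"));
}
```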
panic & assert
- use panic!("this and this should never happen wrong") when something unrecoverable happens, to make the program end immediately
- asserts take an expression and a message, allowing you to do runtime checks on things that should never happen:
assert!(x <= 3, "x should never be higher than 3")
assert_eq!(x, 3)
assert_ne!(x, 4)
- primagen on asserts https://youtu.be/7ySVWcFHz98?si=uxTtx2gD06uDjRIP&t=2114
traits
- declare some behavior, e.g. “what something can do”
- it feels like overwriting a global prototype, but the trait's methods are only available where the trait is imported into the scope of your code
- most standard types implement the Debug and Clone traits, and simple ones such as ints also Copy
- we can give our own structs these traits by using the derive attribute #[derive(Debug)]
- you can implement traits manually with impl
- define a struct:
struct Animal { name: String }
- define a trait:
trait Barks { fn bark(&self); }
- implement that trait on a struct:
impl Barks for Animal {
    fn bark(&self) {
        println!("{} is barking", self.name)
    }
}
- another common trait is From, to turn an object into an object of another kind
impl From<Vec<City>> for Country { fn from(cities: Vec<City>) -> Self { Self { cities } } }
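A runnable sketch of a manual trait implementation plus From. It reuses the Animal and Barks names from above, but bark returns a String here (instead of printing) so the result can be checked, and the From<&str> impl is an assumption for illustration:

```rust
struct Animal {
    name: String,
}

trait Barks {
    fn bark(&self) -> String;
}

// Manual implementation of our own trait.
impl Barks for Animal {
    fn bark(&self) -> String {
        format!("{} is barking", self.name)
    }
}

// From lets us turn a &str into an Animal, and gives us .into() for free.
impl From<&str> for Animal {
    fn from(name: &str) -> Self {
        Animal { name: name.to_string() }
    }
}

fn main() {
    let rex = Animal::from("Rex");
    println!("{}", rex.bark());
    let fido: Animal = "Fido".into();
    println!("{}", fido.bark());
}
```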
trait bounds
- traits can be implemented without functions and used as trait bounds
- these trait bounds can then be used to narrow down generics for a function:
impl Magic for Wizard {}
fn fireball<T: Magic + Debug>(character: &T, opponent: &mut Monster, distance: u32) {
    ...
}
asref and where
- AsRef is a trait that gives you a reference to the object as a different type
- you can use AsRef<str> as a trait bound to make a function accept more than one type
- if you have more than one condition on the generic, you can use the where keyword to split it up into multiple lines
fn print_it<T>(input: T)
where
    T: AsRef<str> + Debug + Display,
{...}
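A minimal sketch of an AsRef<str> bound with a where clause. The describe helper is made up for this note:

```rust
use std::fmt::Display;

// Accepts both &str and String, because both implement AsRef<str> and Display.
fn describe<T>(input: T) -> String
where
    T: AsRef<str> + Display,
{
    // as_ref() gives us a &str view, whatever the concrete type is.
    let bytes = input.as_ref().len();
    format!("{} has {} bytes", input, bytes)
}

fn main() {
    println!("{}", describe("hi"));
    println!("{}", describe(String::from("hey")));
}
```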
chaining
- we can create functional style method chains, passing the returned object to the next function
- when starting from a vector, we need to turn it into an iterator.
- after finishing, we must collect the iterator to turn it back into a collection
let range = 1..=10;
let vec = range.collect::<Vec<u8>>();
println!("numbers 1 to 10: {:?}", vec);
let subvec = vec
    .into_iter()
    .skip(2)
    .take(3)
    .collect::<Vec<u8>>();
println!("{:?}", subvec);
- collect also allows us to specify the type into what we want to collect
- collect uses type hints to determine what type to collect to
iterators
- three types of iterators:
- .iter() gives an iterator of immutable references
- .iter_mut() gives an iterator of mutable references
- .into_iter() gives an iterator of values (not references)
- iter() creates an iterator over immutable references, letting you traverse data without taking ownership. In contrast, into_iter() consumes the collection, transferring ownership and making it possible to modify or transform each element
- a for loop calls into_iter() on what you give it, so looping over a collection directly takes ownership of its values
- borrowing note: after calling .iter() on a value, we can still access the original value, as we got immutable references and values were not moved
- iterators are lazy, they only do work when they are being consumed
- a map without collect (or another consumer) doesn’t fire
- use enumerate() to get both index and value as a tuple
- use char_indices() on a string to get (byte index, char) pairs
- use windows(len) on a vector or slice to iterate over overlapping slices of length len
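A sketch of enumerate, windows and char_indices. All three helper names are made up for illustration. Note that windows panics if len is 0:

```rust
// enumerate() yields (index, item) tuples.
fn labelled(items: &[&str]) -> Vec<String> {
    items
        .iter()
        .enumerate()
        .map(|(index, item)| format!("{}: {}", index, item))
        .collect()
}

// windows(len) yields overlapping slices of length `len`.
fn window_sums(numbers: &[i32], len: usize) -> Vec<i32> {
    numbers
        .windows(len)
        .map(|window| window.iter().sum::<i32>())
        .collect()
}

// char_indices() yields (byte index, char) pairs.
fn char_positions(text: &str) -> Vec<(usize, char)> {
    text.char_indices().collect()
}

fn main() {
    println!("{:?}", labelled(&["one", "two"]));
    println!("{:?}", window_sums(&[1, 2, 3, 4], 2));
    println!("{:?}", char_positions("a\u{e9}b"));
}
```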
functional
- use map to transform each item into a new collection
- take extra care to not put a ; at the end of the last statement in a map closure, or it will return () instead of that value
- use for_each to do something with every item
- filter, reduce etc. are available
- filter_map first maps, and then automatically filters all the None values out.
- use .ok() on a method that returns a result to easily chain it into a functional call
filter_map(|input| input.parse::<f32>().ok())
closures
- define closures with || {}
- declare the parameters of a closure between the pipes: |x: i32| { }
- you can use variables from outside of the closure inside the closure:
let outside = 5;
let add = |i: i32| outside + i;
- in addition to unwrap_or(value) we can also use unwrap_or_else(function) to provide a closure that handles unwrapping a None
- use |_| if you don't need to use the variable in a closure
- when you have an option, use .ok_or("error message") to turn it into a result: use the value or pass an error
- when dealing with
- TODO dive deeper in and_then() ✅ 2024-09-23
- you can use and_then to immediately chain a closure that only runs if the value is Some/Ok; a None or Err is passed along unchanged
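A small sketch of closures with and_then, ok_or and unwrap_or_else. The helper names are made up for this note:

```rust
// Hypothetical helper: parse a number and halve it, but only if it is even.
fn parse_and_halve(input: &str) -> Option<i32> {
    input
        .parse::<i32>()
        .ok() // Result -> Option
        // The closure only runs when parsing produced Some(n).
        .and_then(|n| if n % 2 == 0 { Some(n / 2) } else { None })
}

// Hypothetical helper: turns a missing first char into an error with ok_or.
fn first_char(text: &str) -> Result<char, &'static str> {
    text.chars().next().ok_or("empty string")
}

fn main() {
    let outside = 5;
    let add = |i: i32| outside + i; // the closure captures `outside`
    println!("{}", add(2));
    println!("{:?}", parse_and_halve("8"));
    println!("{:?}", first_char(""));
    // unwrap_or_else takes a closure that only runs on None.
    let fallback = parse_and_halve("7").unwrap_or_else(|| -1);
    println!("{}", fallback);
}
```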
serialisation
- use serde to serialize enums. For example, turning an enum into command line options:
#[serde(rename_all = "kebab-case")]
enum Command {
    Friendly,
    Fuzzer,
    EIP,
    Exploit,
}
command line arguments
- use clap to parse cli arguments from a struct:
#[derive(Parser, Debug)]
#[command(name = "Exploit-Rust")]
#[command(version, about, long_about = None)]
struct Args {
    command: Command,
    host: String,
    return_address: String,
}
fn main() {
    let args = Args::parse();
}
simple splitting in files
- move the function to a separate file, add pub to the fn
- use the following lines to import the function (here a function interact in a file named interact.rs)
mod interact;
use crate::interact::interact;
any, all, find, fold
- use the any() and all() functions in combination with a closure to check vectors for certain values and deal with None values
let some_are_none = vec![Some("yes"), Some("yes"), None];
let result1 = some_are_none
    .iter()
    .all(|x| x.is_some());
- very useful when dealing with multiple async calls at the same time
- rev() reverses the iteration, starting from the back of the iter
- .find() returns the first item that matches a closure, as an Option
- fold() also works with options / None values:
let some_are_none = vec![Some(1), Some(1), None];
let folded_total = some_are_none
    .iter()
    .fold(0, |total_so_far, next| total_so_far + next.unwrap_or(0));
println!("folded total: {}", folded_total);
debugging
- use dbg! to print debug information
- use the RUST_BACKTRACE=1 environment variable to enable backtraces
lifetimes
- you cannot keep references to values that die
- for strings, this is often fixed by giving the string a static lifetime: line: &'static str
- lifetimes are passed just like generics. struct City<'a> means the referenced inputs must live at least as long as the struct
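A minimal sketch of a struct with a lifetime parameter. The name field and the name_len method are assumptions added for illustration:

```rust
// City borrows its name, so the borrowed string must outlive the City.
#[derive(Debug)]
struct City<'a> {
    name: &'a str,
}

impl<'a> City<'a> {
    fn name_len(&self) -> usize {
        self.name.len()
    }
}

fn main() {
    let name = String::from("Utrecht");
    // `name` lives until the end of main, longer than `city`, so this is fine.
    let city = City { name: &name };
    println!("{:?} has a name of {} bytes", city, city.name_len());
}
```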
mutating inside structs
- instead of making a full struct mutable, we can use Cell to make only specific fields mutable
- when using RefCell we can also create new (mutable) references to the value inside the cell
- the borrow checks are only executed at runtime, so the compiler will allow a double mutable borrow, but the program will panic when it happens
mutex
- if a value is declared as a mutex, let my_mutex = Mutex::new(5), we can at some point lock it, and mutate it using the created mutex guard object: let mut mutex_changer = my_mutex.lock().unwrap()
- locked values are unlocked at the end of the lifetime of the mutex changer, or when it is explicitly dropped: std::mem::drop(mutex_changer)
- you don’t need to use a mutex changer variable, you can change the mutex immediately, just don’t forget to dereference the value
*my_mutex.lock().unwrap() = 10
rwlock
- is kind of like a mutex and refcell
- many .read is good
- one .write is good
- read and write together is not good
- first drop the writable reference before trying to read
- use try_read and try_write to avoid locking up
cow “clone on write”
- TODO look into COW
rc
- if we want to use heavy objects multiple times, we can use Rc to create multiple references to the same value
- TODO look further into rc
multithreading
- use std::thread::spawn, which takes a closure that will run on a separate thread
- use the returned join handle and handle.join() to wait for the thread to finish
- use move before the closure to move a value into the thread
todo
- use todo!() macro in unimplemented methods
arc “atomic reference counter”
- if you are going to use a variable in multiple threads, first create a mutex for the value and protect it inside an Arc, then clone the Arc for each thread:
let my_number = Arc::new(Mutex::new(5));
let my_number1 = Arc::clone(&my_number);
std::thread::spawn(move || {
    ...
});
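A runnable sketch of sharing a counter between threads with Arc and Mutex. The parallel_increment helper is made up for this note:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: spawns `threads` threads that each add 1 to a shared counter.
fn parallel_increment(threads: usize) -> usize {
    let counter = Arc::new(Mutex::new(0));
    let mut handles = Vec::new();
    for _ in 0..threads {
        // Each thread gets its own Arc pointing at the same mutex.
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            *counter.lock().unwrap() += 1;
        }));
    }
    // Wait for all threads before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_increment(8));
}
```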
multithreading and chunking
- when dealing with large datasets, use chunking to load part of the data in memory and then use rayon to parallelise the processing
- use Arcs and AtomicBools to keep track of state amongst threads
let password_found: Arc<AtomicBool> = Arc::new(AtomicBool::new(false));
let mut current_chunk: Vec<String> = Vec::with_capacity(CHUNK_SIZE);
let mut found_password: Option<String> = None;
- then use par_iter on the chunk inside a chunk processing function
- use arc.load and arc.store with Ordering::Relaxed to check if other threads already completed, and to store the result if something is found
if found.load(std::sync::atomic::Ordering::Relaxed) {
    return None;
}
...
if password.is_some() {
    found.store(true, Ordering::Relaxed);
}
password.cloned()
defining your own closures
- TODO look into closures in functions
channels
- use std::sync::mpsc: multiple producer, single consumer → many threads send to one place
- the sender side of a channel can be cloned to other threads, like an Arc
- use a join handle to make the main thread wait for them
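A sketch of several threads sending into one channel. The collect_from_workers helper and the "id * 10" payload are made up for illustration:

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper: every worker thread sends one number to a single receiver.
fn collect_from_workers(workers: u32) -> Vec<u32> {
    let (sender, receiver) = mpsc::channel();
    let mut handles = Vec::new();
    for id in 0..workers {
        let sender = sender.clone(); // one sender per thread
        handles.push(thread::spawn(move || {
            sender.send(id * 10).unwrap();
        }));
    }
    // Drop the original sender so the receiver stops once all workers are done.
    drop(sender);
    for handle in handles {
        handle.join().unwrap();
    }
    let mut results: Vec<u32> = receiver.iter().collect();
    results.sort(); // arrival order depends on thread scheduling
    results
}

fn main() {
    println!("{:?}", collect_from_workers(3));
}
```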
attributes
- typical attributes that are used
#[allow(dead_code)]
#[allow(unused_variables)]
#[derive(Display)] → add display functionality (not derivable in the standard library, needs a crate such as derive_more)
#[cfg(test)] → only compile in test configuration
#![no_std] → no standard lib, for devices with small memory/space
testing
- define tests inside a mod annotated with #[cfg(test)], annotate each test function with #[test]
#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn types_returns_char() {
        let result = types();
        assert_eq!(result, '=');
    }
}
box
- box is a smart pointer, it is similar to a reference but owns the value it points to.
- box allows you to put a type on the heap instead of on the stack
- you can use box if you want to do
- recursion in structs,
- dynamic memory allocation
- dynamic dispatch (trait objects)
- transferable ownership without incurring clone
- use Box<dyn Error> as a return type that accepts any error, including your own error enum
use std::error::Error;
#[derive(PartialEq, Debug)]
enum CreationError {
    Negative,
    Zero,
}
// Error requires Debug and Display; the Display impl is not shown in these notes
impl Error for CreationError {}
fn main() -> Result<(), Box<dyn Error>> {
    let pretend_user_input = "42";
    let x: i64 = pretend_user_input.parse()?;
    println!("output={:?}", PositiveNonzeroInteger::new(x)?);
    Ok(())
}
dyn
- dyn tells the compiler that the object we are passing is a trait object: some type that implements the trait, only known at runtime
default
- default is like calling new but without an argument, creating a default value
- you can create a builder pattern with multiple defaults
deref
- Deref is the trait behind using * on a value; well-known smart pointers implement it
- Vec<T> derefs to a slice [T] through Deref
- String derefs to a str through Deref
- you can implement your own Deref to allow “smart” pointers to be dereferenced
- implement DerefMut to also allow mutating the struct
crates
- use mod to create a space for your code
- helps with structuring and reading your code
- creating private scopes to encapsulate functionality
- only code with pub is publicly available, everything is private by default
- for structs, each member you want to expose needs its own pub
cargo
- use cargo add <cratename> to install a crate and add the dependency to the toml file
- when you want to use derive macros, you have to enable the derive feature:
cargo add serde --features derive
async/await
- await is a tool for writing async functions that look like synchronous code
- in asynchronous programming, operations that cannot complete immediately are suspended to the background, and the thread can continue doing other things. when the operation completes, the task is unsuspended and continues processing
- calls to .await yield control back to the thread
- async bodies are lazy, they only run when .await is called on them
- async closures can have values moved to them, just like normal closures
- the tokio crate is the most used runtime, a smaller alternative is smol
future
- async functions and blocks return a value that implements a trait called “Future”
- use ((),()) = futures::join!() to join multiple futures and return when they are finished
- async functions need to be executed by a runtime
- use runtime: futures::executor::block_on(future) to call an async function
net
- use .to_socket_addrs() on a domain:port string to turn it into (an iterator of) socket addresses
WIP: a tour of the standard lib (https://dhghomon.github.io/easy_rust/Chapter_60.html)