This is the last post for this project. I will be discussing the optimization opportunities and how I identified them. In the previous post, I discussed my benchmark results for Aarch64 and x86_64. The timing difference between the two comes from the code itself: x86_64 had optimized code, while the Aarch64 … Continue reading “Fourth Project Post”
Author Archives: Danny Nguyen
Third Project Post
Profiling – FFMPEG: Profiling is used to determine which function of the software uses the most resources while it runs. In my case, it will most likely be the encoding part of the software that takes up the most resources when executing the script above. On x86_64, I will be using the perf record command, which … Continue reading “Third Project Post”
Second Project Post
In my previous project post, I talked about FFMPEG, the test cases I will be using, and the optimization levels. In this post, I will be discussing profiling and how we are going to find out which function has the biggest impact on the amount of resources … Continue reading “Second Project Post”
First Project Post
The Open Software Package: The open software package I have chosen is FFMPEG. This software is used to encode and decode videos. Changing the format of a file requires transcoding: the video is decoded into individual frames and then re-encoded (compressed) in the target format. Benchmark Overview: The chip architectures I will be using are Aarch64 and x86_64. I will be … Continue reading “First Project Post”
Benchmarking Results
In this post, we will record the results of our sample scaling using three approaches: the Naive Algorithm, the Lookup-based Algorithm, and the Fixed-point Algorithm. We will also analyze which algorithm performed best, using CPU time as the measurement. Results: Naive Algorithm, Lookup-based Algorithm, Fixed-point Algorithm, Wall Clock 4000, Wall Clock … Continue reading “Benchmarking Results”
Benchmarking
Introduction to Benchmarking: In this post, we will be discussing how to benchmark. This is an experiment in benchmarking a large number of samples (in the billions), using CPU time to measure performance. The experiment is based around three approaches: the Naive Algorithm, the Lookup-based Algorithm, and the Fixed-point Algorithm. I will be taking … Continue reading “Benchmarking”
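As a rough illustration of the three approaches named in this excerpt, here is a minimal C sketch. It assumes the samples are signed 16-bit values scaled by a volume factor between 0 and 1, which the excerpt does not state. The function names (`scale_naive`, `scale_lookup`, `scale_fixed`, `cpu_seconds`) are hypothetical and are not taken from the original post. CPU time is measured with the standard `clock()` function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Naive: convert every sample to floating point, multiply, convert back. */
void scale_naive(const int16_t *in, int16_t *out, size_t n, float volume) {
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)(in[i] * volume);
}

/* Lookup-based: precompute the scaled value of all 65536 possible
   samples once, then each sample costs a single table read. */
void scale_lookup(const int16_t *in, int16_t *out, size_t n, float volume) {
    static int16_t table[65536];
    for (int32_t s = -32768; s <= 32767; s++)
        table[(uint16_t)s] = (int16_t)(s * volume);
    for (size_t i = 0; i < n; i++)
        out[i] = table[(uint16_t)in[i]];
}

/* Fixed-point: store the volume as an integer with 8 fractional bits,
   multiply in integer arithmetic, then shift the fraction back out.
   Assumes an arithmetic right shift for negative values (the usual
   behaviour of GCC and Clang), which rounds toward negative infinity
   and can therefore differ from the naive result by one. */
void scale_fixed(const int16_t *in, int16_t *out, size_t n, float volume) {
    int32_t factor = (int32_t)(volume * 256.0f);
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)((in[i] * factor) >> 8);
}

typedef void (*scale_fn)(const int16_t *, int16_t *, size_t, float);

/* Returns the CPU time, in seconds, consumed by one call to fn. */
double cpu_seconds(scale_fn fn, const int16_t *in, int16_t *out,
                   size_t n, float volume) {
    assert(fn != NULL);
    clock_t start = clock();
    fn(in, out, n, volume);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A caller would fill a large input buffer once, then pass each of the three functions to `cpu_seconds` with the same buffer and volume so that the timings are comparable. Note that in this sketch the lookup variant rebuilds its table on every call, so that cost is included in its timing.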
Aarch64 Loop Task
Task Objective: In this task we were introduced to the 64-bit architecture family. The task provides the basics for understanding the instruction sets needed to create a program. We will be using the Aarch64 architecture to create a loop that runs 31 times. Each iteration will have a value beside it that indicates … Continue reading “Aarch64 Loop Task”
ELF, Compile, and make/makefile
Executable and Linkable Format (ELF): Running the objdump command on an ELF file displays information about the ELF object. ELF provides a structure for binaries, libraries, and core files. The information shown includes the file format, memory addresses, disassembled sections, the instructions called, and the encoded bits of each instruction in hex format. As an experiment, using Aarch64, … Continue reading “ELF, Compile, and make/makefile”
Introduction to ARM
ARMv8: Previously we experimented with an 8-bit 6502 architecture when converting opcodes to machine language. Modern devices, such as tablets and Android phones, use 32-bit and 64-bit architectures. Thus, ARM introduced ARMv8, which supports both 32-bit (AArch32) and 64-bit (AArch64) execution. Design Licensee: ARM does not produce chips. ARM designs the chips and licenses the … Continue reading “Introduction to ARM”
Adding Calculator
Add Calculator: In this post, I will be discussing how we used 6502 opcodes to take two numbers of up to 2 digits each as user input and add them together to get a final result. The calculator accepts only whole numbers from 0 to 99. This is done using a combination of instructions and ROM routines … Continue reading “Adding Calculator”