LLM Llama Benchmark Tool - Public Work Report
PKC Benchmark Tool MARK, Here's How We've Built It So Far! (Reporter: chatgpt)
This document summarizes the ideas and features we've added from the very beginning of the project until now, so you can see it all at a glance. We prepared it for everyone who uses our benchmark tool, and we hope you find it an easy and enjoyable read!
Performance Test Video (Korean): [Link Here]
How Far We've Come! (Milestone Summary)
■ v3: Building the Foundation
- Perfect for Desktops!!
- Clean results! Instead of CSV files, we made it display pretty tables and save them directly as HTML files.
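To give a feel for the "tables saved as HTML" idea, here is a minimal sketch using only the Python standard library. The function name `results_to_html` and the row format (a list of dicts) are our own assumptions for illustration, not the tool's actual code.

```python
import html


def results_to_html(rows, title="Benchmark Results"):
    """Render a list of dicts as one standalone HTML page with a table.

    Hypothetical helper: column names come from the first row's keys.
    """
    columns = list(rows[0].keys()) if rows else []
    # Escape every value so model names like "a<b" cannot break the markup.
    head = "".join(f"<th>{html.escape(str(c))}</th>" for c in columns)
    body = "".join(
        "<tr>"
        + "".join(f"<td>{html.escape(str(row.get(c, '')))}</td>" for c in columns)
        + "</tr>"
        for row in rows
    )
    return (
        "<!DOCTYPE html><html><head><meta charset='utf-8'>"
        f"<title>{html.escape(title)}</title></head><body>"
        f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"
        "</body></html>"
    )
```

Writing the returned string to a file with UTF-8 encoding gives a page that opens in any browser.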
■ v4: Solid to the Core!
- (Smart Engine) It automatically detects and uses a supported GPU backend (CUDA or MPS) if one is available; otherwise it runs on the CPU.
- (Safety First) We reinforced the program to prevent it from crashing even if issues occur when fetching graphics card information (power, temperature).
- (Automation King) It now finds local model files more intelligently and we've added a filter function when searching for models on Hugging Face for greater convenience!
- (User Experience Up!) We made it easier to enter the server address and added guidance for offline use. The tooltips have also been made more user-friendly.
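The "Smart Engine" fallback described above can be sketched roughly like this. We assume here that detection goes through PyTorch, which this report does not confirm; `pick_device` and `detect_device` are hypothetical names for illustration.

```python
def pick_device(cuda_available, mps_available):
    """Prefer CUDA, then Apple MPS, then fall back to the CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device():
    """Probe the machine and never raise; any problem means 'cpu'.

    Assumption: PyTorch is the backend. If it is not installed,
    or probing fails, we quietly fall back to the CPU.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    try:
        cuda = torch.cuda.is_available()
        # Older PyTorch builds have no torch.backends.mps at all.
        mps_backend = getattr(torch.backends, "mps", None)
        mps = bool(mps_backend is not None and mps_backend.is_available())
    except Exception:
        return "cpu"
    return pick_device(cuda, mps)
```

Keeping the decision (`pick_device`) separate from the probing (`detect_device`) makes the priority order easy to check without any GPU at hand.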
■ v5: Now with One-Click!
- It now always creates a dedicated virtual environment (.pkc-venv) to prevent conflicts with other programs.
- On Windows, macOS, or Linux, just one click on a single file handles everything from installation to execution!
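Here is a small sketch of how a dedicated `.pkc-venv` could be created with Python's standard `venv` module. This is only our illustration of the idea; the helper names `venv_python_path` and `ensure_venv` are hypothetical, and the real one-click scripts may work differently.

```python
import sys
import venv
from pathlib import Path


def venv_python_path(env_dir, platform=sys.platform):
    """Return where the interpreter lives inside a virtual environment."""
    if platform == "win32":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


def ensure_venv(project_dir, name=".pkc-venv"):
    """Create the environment if it does not exist yet, then return
    the path to its Python interpreter (hypothetical helper)."""
    env_dir = Path(project_dir) / name
    # pyvenv.cfg is written by venv itself, so it marks an existing env.
    if not (env_dir / "pyvenv.cfg").exists():
        venv.create(env_dir, with_pip=True)
    return venv_python_path(env_dir)
```

A launcher could then install `requirements.txt` and start the server with the returned interpreter, so nothing touches the system-wide Python.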
■ v5.3 ~ v5.4: A Prettier Interface
- We slimmed down the input fields and aligned them to the right for a much cleaner look.
- The number of prompt input fields was increased to three, and it now shows the number of models currently running for better convenience!
■ v5.5c: Language Policy Update
- Korean/English now switches automatically based on browser settings, no switch needed. Please use your browser's translation feature for other languages!
■ v5.5c + F1: Major Feature Integration!
- We combined Hugging Face model search/download, real-time progress checking, and local model finding into one.
- We kept our original clean charts and convenient prompt input method!
■ v5.6: Now Multilingual!
- We officially support Korean, English, Japanese, and Chinese (Simplified/Traditional).
- It displays automatically based on the browser's language, and you can also specify the language directly in the address bar, like ?lang=en.
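The language choice above (an explicit ?lang= value first, then the browser's preference) could look something like the sketch below. We assume the decision is made from the query parameter and an `Accept-Language` style string; the language codes, the English default, and the function names are all our own assumptions, and the actual tool may well do this in the browser instead.

```python
DEFAULT_LANG = "en"  # assumed fallback, not confirmed by the report


def normalize(tag):
    """Map a tag like 'ja-JP' or 'zh-Hant-HK' to a supported code, else None."""
    subtags = tag.strip().lower().replace("_", "-").split("-")
    primary = subtags[0]
    if primary == "zh":
        # Traditional Chinese regions/scripts; everything else is Simplified.
        if set(subtags[1:]) & {"tw", "hk", "mo", "hant"}:
            return "zh-TW"
        return "zh-CN"
    if primary in ("ko", "en", "ja"):
        return primary
    return None


def pick_language(query_lang=None, accept_language=""):
    """An explicit ?lang= value wins; otherwise use the browser preference."""
    if query_lang:
        chosen = normalize(query_lang)
        if chosen:
            return chosen
    candidates = []
    for position, item in enumerate(accept_language.split(",")):
        parts = item.split(";")
        quality = 1.0
        for param in parts[1:]:
            param = param.strip()
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        # Sort by highest quality first, then by original order.
        candidates.append((-quality, position, parts[0]))
    for _, _, tag in sorted(candidates):
        chosen = normalize(tag)
        if chosen:
            return chosen
    return DEFAULT_LANG
```

For example, `pick_language("en", "ko-KR")` picks English because the address bar value takes priority over the browser setting.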
■ v5.6.1: Urgent Layout Hotfix!
- We moved the prompt input area to the left and the local model selection area to the right to make it more user-friendly.
- We also widened the spacing in the number input fields to make them easier to click!
So, what can it do now? (Current Features)
- One-Click Execution: Unzip and click one file to run it immediately. Simple, right?
- Model Management: You can search and download models directly from Hugging Face, as well as use models on your own computer.
- Run Benchmark: Enter your desired text in the three prompt fields (blanks are skipped automatically!) and run it. The results will appear in real-time in the Summary, Log, and Chart tabs!
- Save Results: You can save the current screen as an HTML file to use in reports.
- Multilingual Support: Automatically displays in Korean, English, Japanese, or Chinese.
Actually, it has these advantages too! (Internal Quality)
- Smart Device Selection: It automatically uses a good GPU if available, and smoothly switches to the CPU if not.
- Real-time Streaming: Intermediate results and logs appear on the screen immediately without delay. No more frustrating waits!
- Meticulous Performance Measurement: It records all the essential information, including performance (TTFT, TPS), memory, power, and temperature.
- Robust Stability: The program is built to keep running even when drivers are missing or permission issues occur.
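For readers curious about the two headline numbers, here is a minimal sketch of measuring TTFT (time to first token) and TPS (tokens per second) over a stream of tokens. We assume TPS means total tokens divided by total elapsed time; the tool's exact definition is not stated in this report, and `measure_stream` is a hypothetical name.

```python
import time


def measure_stream(token_iter, clock=time.perf_counter):
    """Consume a token stream and report TTFT and TPS.

    Assumed definitions for this sketch:
      ttft_s = time from start until the first token arrives
      tps    = tokens received / total elapsed seconds
    """
    start = clock()
    first_token_at = None
    count = 0
    for _ in token_iter:
        now = clock()
        if first_token_at is None:
            first_token_at = now
        count += 1
    end = clock()
    total = end - start
    return {
        "ttft_s": (first_token_at - start) if first_token_at is not None else None,
        "tokens": count,
        "total_s": total,
        # Guard against an empty stream or a zero-length interval.
        "tps": count / total if (total > 0 and count) else 0.0,
    }
```

The `clock` parameter is there so the timing logic can be checked with a fake clock instead of real waiting.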
These are the recently changed files (Latest)
- benchmark_server.py — Contains the core server functions, including Hugging Face integration and real-time log transmission.
- benchmark_canvas.html — The screen we see! All design elements like multilingual support and layout are in here.
- install_wizard.py — The wizard file that helps with the one-click installation.
- requirements.txt — A list of the required libraries.
- setup. / start_server. / OneClick_*** — Script files that enable one-click execution.
There are still a few things to fix (Known Issues)
- Broken Layout: There's an issue where the card order can get slightly mixed up depending on the screen size. We will definitely fix this in the next version (v5.6.2)!
- Finding Missing Translations: If you see any untranslated parts, please let us know! We'll add them right away.
- More Powerful Search Filters: We plan to add features to sort models or filter by file size in the future.
Using it is really simple!
- Unzip the file and run the file that starts with OneClick. Your browser will pop open.
- Feel free to enter any text you want to test in the three prompt fields.
- Press the Run Benchmark button, and you're done! Check the results in real-time on the chart, and when it's finished, press the Save as HTML button.
If you have any questions, feel free to ask!
- PKC Blog: https://pkc0412.tistory.com/
- Email: pkc0412@gmail.com
by: chatgpt
The public version of the benchmark tool will be released on the blog once it is complete.
Tags
#LLM #Benchmark #AI #HuggingFace #DeepLearning #MachineLearning #Python #GPU #Tech #OpenSource #DevLog #PerformanceTesting