Commit 75c0bcd

Add initial chat support
The chat conversion was done mostly by gpt-oss 20b from the React WebUI. The chats are saved in a SQLite database. You can view, rename, and delete them from a "llama.cpp Conversations" side view. There is also a Tools > llama.cpp > New Conversation menu entry. The chats can be saved as Markdown by clicking on the "invisible" document icon. The chats themselves are a list of QLabels that show Markdown rendered as Qt HTML4 using the `md4c` library. There is also a QLiteHtml option, but it needs more work; eventually it will replace the QLabel version.
1 parent 2d88f9c commit 75c0bcd

71 files changed

Lines changed: 33747 additions & 32 deletions

3rdparty/md4c/CHANGELOG.md

Lines changed: 604 additions & 0 deletions
Large diffs are not rendered by default.

3rdparty/md4c/CMakeLists.txt

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
1+
2+
cmake_minimum_required(VERSION 3.5)
3+
project(MD4C C)
4+
5+
set(MD_VERSION_MAJOR 0)
6+
set(MD_VERSION_MINOR 5)
7+
set(MD_VERSION_RELEASE 2)
8+
set(MD_VERSION "${MD_VERSION_MAJOR}.${MD_VERSION_MINOR}.${MD_VERSION_RELEASE}")
9+
10+
set(PROJECT_VERSION "${MD_VERSION}")
11+
set(PROJECT_URL "https://github.com/mity/md4c")
12+
13+
14+
option(BUILD_MD2HTML_EXECUTABLE "Whether to compile the md2html executable" ON)
15+
16+
17+
if(WIN32)
18+
# On Windows, given there is no standard lib install dir etc., we rather
19+
# by default build static lib.
20+
option(BUILD_SHARED_LIBS "Build MD4C as a shared library" OFF)
21+
else()
22+
# On Linux, MD4C is slowly being added to some distros which prefer
23+
# shared lib.
24+
option(BUILD_SHARED_LIBS "Build MD4C as a shared library" ON)
25+
endif()
26+
27+
add_definitions(
28+
-DMD_VERSION_MAJOR=${MD_VERSION_MAJOR}
29+
-DMD_VERSION_MINOR=${MD_VERSION_MINOR}
30+
-DMD_VERSION_RELEASE=${MD_VERSION_RELEASE}
31+
)
32+
33+
set(CMAKE_CONFIGURATION_TYPES Debug Release RelWithDebInfo MinSizeRel)
34+
if("${CMAKE_BUILD_TYPE}" STREQUAL "")
35+
set(CMAKE_BUILD_TYPE $ENV{CMAKE_BUILD_TYPE})
36+
37+
if("${CMAKE_BUILD_TYPE}" STREQUAL "")
38+
set(CMAKE_BUILD_TYPE "Release")
39+
endif()
40+
endif()
41+
42+
43+
if(CMAKE_C_COMPILER_ID MATCHES "GNU|Clang")
44+
add_compile_options(-Wall -Wextra -Wshadow)
45+
46+
# We enforce -Wdeclaration-after-statement because Qt project needs to
47+
# build MD4C with Integrity compiler which chokes whenever a declaration
48+
# is not at the beginning of a block.
49+
add_compile_options(-Wdeclaration-after-statement)
50+
elseif(MSVC)
51+
# Disable warnings about the so-called unsecured functions:
52+
add_definitions(/D_CRT_SECURE_NO_WARNINGS)
53+
add_compile_options(/W3)
54+
55+
# Specify proper C runtime library:
56+
string(REGEX REPLACE "/M[DT]d?" "" CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG}")
57+
string(REGEX REPLACE "/M[DT]d?" "" CMAKE_C_FLAGS_RELEASE "${CMAKE_C_FLAGS_RELEASE}")
58+
string(REGEX REPLACE "/M[DT]d?" "" CMAKE_C_FLAGS_RELWITHDEBINFO "${CMAKE_C_FLAGS_RELWITHDEBINFO}")
59+
string(REGEX REPLACE "/M[DT]d?" "" CMAKE_C_FLAGS_MINSIZEREL "${CMAKE_C_FLAGS_MINSIZEREL}")
60+
set(CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG} /MTd")
61+
set(CMAKE_C_FLAGS_RELEASE "${CMAKE_C_FLAGS_RELEASE} /MT")
62+
set(CMAKE_C_FLAGS_RELWITHDEBINFO "${CMAKE_C_FLAGS_RELWITHDEBINFO} /MT")
63+
set(CMAKE_C_FLAGS_MINSIZEREL "${CMAKE_C_FLAGS_MINSIZEREL} /MT")
64+
endif()
65+
66+
include(GNUInstallDirs)
67+
68+
add_subdirectory(src)
69+
if (BUILD_MD2HTML_EXECUTABLE)
70+
add_subdirectory(md2html)
71+
endif ()

3rdparty/md4c/LICENSE.md

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
1+
2+
# The MIT License (MIT)
3+
4+
Copyright © 2016-2024 Martin Mitáš
5+
6+
Permission is hereby granted, free of charge, to any person obtaining a
7+
copy of this software and associated documentation files (the “Software”),
8+
to deal in the Software without restriction, including without limitation
9+
the rights to use, copy, modify, merge, publish, distribute, sublicense,
10+
and/or sell copies of the Software, and to permit persons to whom the
11+
Software is furnished to do so, subject to the following conditions:
12+
13+
The above copyright notice and this permission notice shall be included
14+
in all copies or substantial portions of the Software.
15+
16+
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS
17+
OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
18+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
19+
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
20+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
21+
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
22+
IN THE SOFTWARE.

3rdparty/md4c/README.md

Lines changed: 297 additions & 0 deletions
@@ -0,0 +1,297 @@
1+
2+
# MD4C Readme
3+
4+
* Home: http://github.com/mity/md4c
5+
* Wiki: http://github.com/mity/md4c/wiki
6+
* Issue tracker: http://github.com/mity/md4c/issues
7+
8+
MD4C stands for "Markdown for C" and that's exactly what this project is about.
9+
10+
11+
## What is Markdown
12+
13+
In short, Markdown is the markup language this `README.md` file is written in.
14+
15+
The following resources can explain more if you are unfamiliar with it:
16+
* [Wikipedia article](http://en.wikipedia.org/wiki/Markdown)
17+
* [CommonMark site](http://commonmark.org)
18+
19+
20+
## What is MD4C
21+
22+
MD4C is a Markdown parser implementation in C, with the following features:
23+
24+
* **Compliance:** Generally, MD4C aims to be compliant with the latest version of
25+
[CommonMark specification](http://spec.commonmark.org/). Currently, we are
26+
fully compliant with CommonMark 0.31.
27+
28+
* **Extensions:** MD4C supports some commonly requested and accepted extensions.
29+
See below.
30+
31+
* **Performance:** MD4C is [very fast](https://talk.commonmark.org/t/2520).
32+
33+
* **Compactness:** MD4C parser is implemented in one source file and one header
34+
file. There are no dependencies other than standard C library.
35+
36+
* **Embedding:** MD4C parser is easy to reuse in other projects, its API is
37+
very straightforward: There is actually just one function, `md_parse()`.
38+
39+
* **Push model:** MD4C parses the complete document and calls a few callback
40+
functions provided by the application to inform it about a start/end of
41+
every block, a start/end of every span, and with any textual contents.
42+
43+
* **Portability:** MD4C builds and works on Windows and POSIX-compliant OSes.
44+
(It should be simple to make it run also on most other platforms, at least as
45+
long as the platform provides C standard library, including a heap memory
46+
management.)
47+
48+
* **Encoding:** MD4C by default expects UTF-8 encoding of the input document.
49+
But it can be compiled to recognize ASCII-only control characters (i.e. to
50+
disable all Unicode-specific code), or (on Windows) to expect UTF-16 (i.e.
51+
what is on Windows commonly called just "Unicode"). See more details below.
52+
53+
* **Permissive license:** MD4C is available under the [MIT license](LICENSE.md).
54+
55+
56+
## Using MD4C
57+
58+
### Parsing Markdown
59+
60+
If you just need to parse a Markdown document, you need to include `md4c.h`
61+
and link against MD4C library (`-lmd4c`); or alternatively add `md4c.[hc]`
62+
directly to your code base as the parser is only implemented in the single C
63+
source file.
64+
65+
The main provided function is `md_parse()`. It takes a text in the Markdown
66+
syntax and a pointer to a structure which provides pointers to several callback
67+
functions.
68+
69+
As `md_parse()` processes the input, it calls the callbacks (when entering or
70+
leaving any Markdown block or span; and when outputting any textual content of
71+
the document), allowing the application to convert it into another format or render
72+
it onto the screen.
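
The push model described above can be sketched with a toy example. Note that this is not the MD4C API: `ToyCallbacks`, `toy_parse()` and the renderer functions are hypothetical names invented for illustration, and the "grammar" (every `*` toggles an emphasis span) is far simpler than Markdown. It only shows how a parser can drive application callbacks without building a tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback table, loosely modelled on the push model described
 * above. These names are illustrative and are NOT the md4c API. */
typedef struct {
    void (*enter_span)(void *userdata);
    void (*leave_span)(void *userdata);
    void (*text)(const char *s, size_t len, void *userdata);
} ToyCallbacks;

/* Toy "parser": every '*' toggles an emphasis span; everything else is
 * reported as text. An unclosed span is simply left open. */
static void toy_parse(const char *input, size_t len,
                      const ToyCallbacks *cb, void *userdata)
{
    size_t start = 0;
    size_t i;
    int in_span = 0;

    assert(input != NULL && cb != NULL);
    for (i = 0; i <= len; i++) {
        if (i == len || input[i] == '*') {
            if (i > start)
                cb->text(input + start, i - start, userdata);
            if (i < len) {
                if (in_span)
                    cb->leave_span(userdata);
                else
                    cb->enter_span(userdata);
                in_span = !in_span;
            }
            start = i + 1;
        }
    }
}

/* Example renderer: writes HTML into a small fixed-size buffer and silently
 * drops anything that does not fit. */
typedef struct {
    char out[128];
    size_t n;
} ToyHtml;

static void put(ToyHtml *h, const char *s, size_t len)
{
    if (h->n + len < sizeof h->out) {
        memcpy(h->out + h->n, s, len);
        h->n += len;
        h->out[h->n] = '\0';
    }
}

static void on_enter(void *u) { put((ToyHtml *)u, "<em>", 4); }
static void on_leave(void *u) { put((ToyHtml *)u, "</em>", 5); }
static void on_text(const char *s, size_t len, void *u) { put((ToyHtml *)u, s, len); }
```

Calling `toy_parse()` on `a *b* c` with these three callbacks leaves `a <em>b</em> c` in the buffer. A different set of callbacks could draw on screen or count words instead, without any change to the parser.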
73+
74+
75+
### Converting to HTML
76+
77+
If you need to convert Markdown to HTML, include `md4c-html.h` and link against
78+
MD4C-HTML library (`-lmd4c-html`); or alternatively add the sources `md4c.[hc]`,
79+
`md4c-html.[hc]` and `entity.[hc]` into your code base.
80+
81+
To convert a Markdown input, call `md_html()` function. It takes the Markdown
82+
input and calls the provided callback function. The callback is fed with
83+
chunks of the HTML output. Typical callback implementation just appends the
84+
chunks into a buffer or writes them to a file.
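
A minimal sketch of such a buffer-appending callback follows. The `OutBuf` type and the `append_chunk()` name are made up for this example, and the parameter types (`const char *`, `unsigned`, `void *`) are an assumption about the callback shape; the authoritative declaration is in `md4c-html.h`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Growable byte buffer that accumulates output chunks (hypothetical helper). */
typedef struct {
    char  *data;
    size_t size;
    size_t capacity;
} OutBuf;

/* Callback of the shape (chunk, length, userdata). Assumption: the exact
 * parameter types are declared in md4c-html.h; adjust to match the header. */
static void append_chunk(const char *chunk, unsigned len, void *userdata)
{
    OutBuf *buf = (OutBuf *)userdata;
    assert(buf != NULL);

    /* Grow geometrically; always keep room for a terminating NUL. */
    if (buf->size + len + 1 > buf->capacity) {
        size_t new_cap = buf->capacity ? buf->capacity : 64;
        char *p;
        while (new_cap < buf->size + len + 1)
            new_cap *= 2;
        p = realloc(buf->data, new_cap);
        if (p == NULL)
            return;  /* out of memory: chunk is dropped; a real app should record this */
        buf->data = p;
        buf->capacity = new_cap;
    }
    memcpy(buf->data + buf->size, chunk, len);
    buf->size += len;
    buf->data[buf->size] = '\0';
}
```

The application would zero-initialize an `OutBuf`, pass its address as the user data pointer, and `free()` the `data` member once the accumulated HTML is no longer needed.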
85+
86+
87+
## Markdown Extensions
88+
89+
The default behavior is to recognize only Markdown syntax defined by the
90+
[CommonMark specification](http://spec.commonmark.org/).
91+
92+
However, with appropriate flags, the behavior can be tuned to enable some
93+
extensions:
94+
95+
* With the flag `MD_FLAG_COLLAPSEWHITESPACE`, a non-trivial whitespace is
96+
collapsed into a single space.
97+
98+
* With the flag `MD_FLAG_TABLES`, GitHub-style tables are supported.
99+
100+
* With the flag `MD_FLAG_TASKLISTS`, GitHub-style task lists are supported.
101+
102+
* With the flag `MD_FLAG_STRIKETHROUGH`, strike-through spans are enabled
103+
(text enclosed in tilde marks, e.g. `~foo bar~`).
104+
105+
* With the flag `MD_FLAG_PERMISSIVEURLAUTOLINKS` permissive URL autolinks
106+
(not enclosed in `<` and `>`) are supported.
107+
108+
* With the flag `MD_FLAG_PERMISSIVEEMAILAUTOLINKS`, permissive e-mail
109+
autolinks (not enclosed in `<` and `>`) are supported.
110+
111+
* With the flag `MD_FLAG_PERMISSIVEWWWAUTOLINKS` permissive WWW autolinks
112+
without any scheme specified (e.g. `www.example.com`) are supported. MD4C
113+
then assumes `http:` scheme.
114+
115+
* With the flag `MD_FLAG_LATEXMATHSPANS` LaTeX math spans (`$...$`) and
116+
LaTeX display math spans (`$$...$$`) are supported. (Note though that the
117+
HTML renderer outputs them verbatim in a custom tag `<x-equation>`.)
118+
119+
* With the flag `MD_FLAG_WIKILINKS`, wiki-style links (`[[link label]]` and
120+
`[[target article|link label]]`) are supported. (Note that the HTML renderer
121+
outputs them in a custom tag `<x-wikilink>`.)
122+
123+
* With the flag `MD_FLAG_UNDERLINE`, underscore (`_`) denotes an underline
124+
instead of an ordinary emphasis or strong emphasis.
125+
126+
A few features of CommonMark (those some people see as mis-features) may be
127+
disabled with the following flags:
128+
129+
* With the flag `MD_FLAG_NOHTMLSPANS` or `MD_FLAG_NOHTMLBLOCKS`, raw inline
130+
HTML or raw HTML blocks respectively are disabled.
131+
132+
* With the flag `MD_FLAG_NOINDENTEDCODEBLOCKS`, indented code blocks are
133+
disabled.
134+
135+
136+
## Input/Output Encoding
137+
138+
The CommonMark specification declares that any sequence of Unicode code points
139+
is a valid CommonMark document.
140+
141+
But, on closer inspection, Unicode plays a role in only a few very specific
142+
situations when parsing Markdown documents:
143+
144+
1. For detection of word boundaries when processing emphasis and strong
145+
emphasis, some classification of Unicode characters (whether it is
146+
a whitespace or a punctuation) is needed.
147+
148+
2. For (case-insensitive) matching of a link reference label with the
149+
corresponding link reference definition, Unicode case folding is used.
150+
151+
3. For translating HTML entities (e.g. `&amp;`) and numeric character
152+
references (e.g. `&#35;` or `&#xcab;`) into their Unicode equivalents.
153+
154+
However, note that MD4C leaves this translation to the renderer/application; as
155+
the renderer is supposed to really know output encoding and whether it
156+
really needs to perform this kind of translation. (For example, when the
157+
renderer outputs HTML, it may leave the entities untranslated and defer the
158+
work to a web browser.)
159+
160+
MD4C relies on this property of CommonMark and the implementation is, to
161+
a large degree, encoding-agnostic. Most of MD4C code only assumes that the
162+
encoding of your choice is compatible with ASCII. I.e. that the codepoints
163+
below 128 have the same numeric values as ASCII.
164+
165+
Any input MD4C does not understand is simply seen as part of the document text
166+
and sent to the renderer's callback functions unchanged.
167+
168+
The two situations (word boundary detection and link reference matching) where
169+
MD4C has to understand Unicode are handled as specified by the following
170+
preprocessor macros (as specified at the time MD4C is being built):
171+
172+
* If preprocessor macro `MD4C_USE_UTF8` is defined, MD4C assumes UTF-8 for the
173+
word boundary detection and for the case-insensitive matching of link labels.
174+
175+
When none of these macros is explicitly used, this is the default behavior.
176+
177+
* On Windows, if preprocessor macro `MD4C_USE_UTF16` is defined, MD4C uses
178+
`WCHAR` instead of `char` and assumes UTF-16 encoding in those situations.
179+
(UTF-16 is what Windows developers usually call just "Unicode" and what
180+
Win32API generally works with.)
181+
182+
Note that because this macro affects also the types in `md4c.h`, you have
183+
to define the macro both when building MD4C as well as when including
184+
`md4c.h`.
185+
186+
Also note this is only supported in the parser (`md4c.[hc]`). The HTML
187+
renderer does not support this and you will have to write your own custom
188+
renderer to use this feature.
189+
190+
* If preprocessor macro `MD4C_USE_ASCII` is defined, MD4C assumes nothing but
191+
an ASCII input.
192+
193+
That effectively means that non-ASCII whitespace or punctuation characters
194+
won't be recognized as such and that link reference matching will work in
195+
a case-insensitive way only for ASCII letters (`[a-zA-Z]`).
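
To make the ASCII-only behavior concrete, here is a small sketch of a comparison that folds case only for `[a-zA-Z]`. The function names are hypothetical and this is not MD4C's internal code; it also ignores the whitespace normalization that real link label matching performs.

```c
#include <assert.h>
#include <stddef.h>

/* Lower-case ASCII letters only; every other value is returned unchanged. */
static int ascii_lower(int ch)
{
    return (ch >= 'A' && ch <= 'Z') ? ch + ('a' - 'A') : ch;
}

/* Returns 1 if the two labels are equal when only ASCII letters are
 * compared case-insensitively, 0 otherwise. Hypothetical helper. */
static int label_eq_ascii(const char *a, size_t a_len,
                          const char *b, size_t b_len)
{
    size_t i;

    assert((a != NULL || a_len == 0) && (b != NULL || b_len == 0));
    if (a_len != b_len)
        return 0;
    for (i = 0; i < a_len; i++) {
        if (ascii_lower((unsigned char)a[i]) != ascii_lower((unsigned char)b[i]))
            return 0;
    }
    return 1;
}
```

With this rule `Foo Bar` matches `fOO bAR`, but the UTF-8 encodings of `é` and `É` do not match each other, because their bytes lie outside the ASCII range and are compared verbatim.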
196+
197+
198+
## Documentation
199+
200+
The API of the parser is quite well documented in the comments in the `md4c.h`.
201+
Similarly, the markdown-to-html API is described in its header `md4c-html.h`.
202+
203+
There is also a [project wiki](http://github.com/mity/md4c/wiki) which provides
204+
some more comprehensive documentation. However note it is incomplete and some
205+
details may be somewhat outdated.
206+
207+
208+
## FAQ
209+
210+
**Q: How does MD4C compare to other Markdown parsers?**
211+
212+
**A:** Some other implementations combine Markdown parser and HTML generator
213+
into a single entangled code hidden behind an interface which just allows the
214+
conversion from Markdown to HTML. They are often unusable if you want to
215+
process the input in any other way.
216+
217+
Second, most parsers (if not all of them; at least within the scope of C/C++
218+
language) are full DOM-like parsers: They construct an abstract syntax tree (AST)
219+
representation of the whole Markdown document. That takes time and it leads to
220+
a bigger memory footprint.
221+
222+
Building an AST is completely fine as long as you need it. If you don't, there is
223+
a very high chance that using MD4C will be substantially faster and less hungry
224+
in terms of memory consumption.
225+
226+
Last but not least, some Markdown parsers are implemented in a naive way. When
227+
fed with a [smartly crafted input pattern](test/pathological_tests.py), they
228+
may exhibit quadratic (or even worse) parsing times. What MD4C can still parse
229+
in a fraction of a second may turn into long minutes or possibly hours with them.
230+
Hence, when such a naive parser is used to process an input from an untrusted
231+
source, the possibility of denial-of-service attacks becomes a real danger.
232+
233+
A lot of our effort went into providing linear parsing times no matter what
234+
kind of crazy input MD4C parser is fed with. (If you encounter an input pattern
235+
which leads to super-linear parsing times, please do not hesitate and report it
236+
as a bug.)
237+
238+
**Q: Does MD4C perform any input validation?**
239+
240+
**A:** No. And we are proud of it. :-)
241+
242+
The CommonMark specification states that any sequence of Unicode characters is
243+
a valid Markdown document. (In practice, this more or less always means UTF-8
244+
encoding.)
245+
246+
In other words, according to the specification, it does not matter whether some
247+
Markdown syntax construction is in some way broken or not. If it's broken, it
248+
won't be recognized and the parser should see it just as a verbatim text.
249+
250+
MD4C takes this a step further: It sees any sequence of bytes as a valid input,
251+
following completely the GIGO philosophy (garbage in, garbage out). I.e. any
252+
ill-formed UTF-8 byte sequence will propagate to the respective callback as
253+
a part of the text.
254+
255+
If you need to validate that the input is, say, a well-formed UTF-8 document,
256+
you have to do it on your own. The easiest way to do this is to simply
257+
validate the whole document before passing it to the MD4C parser.
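
As an illustration of such a pre-validation step, the following sketch checks a byte buffer for well-formed UTF-8 before it would be handed to the parser. The function name `is_valid_utf8()` is hypothetical and not part of MD4C. It rejects overlong encodings, surrogate code points and values above U+10FFFF.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if buf[0..len) is well-formed UTF-8, 0 otherwise.
 * Hypothetical helper; not part of MD4C. */
static int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    assert(buf != NULL || len == 0);
    while (i < len) {
        unsigned char b = buf[i];
        size_t need;        /* number of continuation bytes expected */
        unsigned long cp;   /* decoded code point */
        unsigned long min;  /* smallest code point allowed for this length */
        size_t k;

        if (b < 0x80) {
            i++;
            continue;
        } else if ((b & 0xE0) == 0xC0) {
            need = 1; cp = b & 0x1F; min = 0x80;
        } else if ((b & 0xF0) == 0xE0) {
            need = 2; cp = b & 0x0F; min = 0x800;
        } else if ((b & 0xF8) == 0xF0) {
            need = 3; cp = b & 0x07; min = 0x10000;
        } else {
            return 0;       /* stray continuation byte or invalid lead byte */
        }

        if (len - i - 1 < need)
            return 0;       /* sequence truncated at end of input */
        for (k = 1; k <= need; k++) {
            unsigned char c = buf[i + k];
            if ((c & 0xC0) != 0x80)
                return 0;   /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min)
            return 0;       /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;       /* UTF-16 surrogate */
        if (cp > 0x10FFFF)
            return 0;       /* beyond the Unicode range */
        i += need + 1;
    }
    return 1;
}
```

An application that must reject malformed input would call this on the whole document first and only pass the buffer on to the parser when it returns 1.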
258+
259+
260+
## License
261+
262+
MD4C is covered by the MIT license, see the file `LICENSE.md`.
263+
264+
265+
## Links to Related Projects
266+
267+
Ports and bindings to other languages:
268+
269+
* [commonmark-d](https://github.com/AuburnSounds/commonmark-d):
270+
Port of MD4C to D language.
271+
272+
* [markdown-wasm](https://github.com/rsms/markdown-wasm):
273+
Port of MD4C to WebAssembly.
274+
275+
* [PyMD4C](https://github.com/dominickpastore/pymd4c):
276+
Python bindings for MD4C
277+
278+
Software using MD4C:
279+
280+
* [imgui_md](https://github.com/mekhontsev/imgui_md):
281+
Markdown renderer for [Dear ImGui](https://github.com/ocornut/imgui)
282+
283+
* [MarkDown Monolith Assembler](https://github.com/1Hyena/mdma):
284+
A command line tool for building browser-based books.
285+
286+
* [QOwnNotes](https://www.qownnotes.org/):
287+
A plain-text file notepad and todo-list manager with markdown support and
288+
ownCloud / Nextcloud integration.
289+
290+
* [Qt](https://www.qt.io/):
291+
Cross-platform C++ GUI framework.
292+
293+
* [Textosaurus](https://github.com/martinrotter/textosaurus):
294+
Cross-platform text editor based on Qt and Scintilla.
295+
296+
* [8th](https://8th-dev.com/):
297+
Cross-platform concatenative programming language.
