
Commit MetaInfo

Revision: f02f0584978b96c68bf30c7836aa6675fefcce7e (tree)
Date: 2022-08-11 23:10:02
Author: Albert Mietus < albert AT mietus DOT nl >
Committer: Albert Mietus < albert AT mietus DOT nl >

Log Message

asis

Change Summary

Diff

diff -r 127f37d96bce -r f02f0584978b CCastle/1.Usage/7.BusyCores.rst
--- a/CCastle/1.Usage/7.BusyCores.rst Sat Jul 23 20:45:18 2022 +0200
+++ b/CCastle/1.Usage/7.BusyCores.rst Thu Aug 11 16:10:02 2022 +0200
@@ -226,8 +226,9 @@
226226 At the same time it should run efficiently. The goal is **not** to keep all cores *busy*, but to use (all) cores to
227227 gain maximal speed-up.
228228
229-Distributing tasks over multiple cores (or even distributed computers [#web]_ is not new. In :ref:`ConcurrentComputingConcepts` we
230-analyse some, before we design & describe Castle’s :ref:`XXX`.
229+Distributing tasks over multiple cores (or even distributed computers) [#web]_ is not new. In
230+:ref:`ConcurrentComputingConcepts` we describe some (common, theoretical) concepts, and in :ref:`ActorAbstractions` we
231+analyse a powerful approach that we would like to use for CCastle.
231232
232233 ----------
233234
diff -r 127f37d96bce -r f02f0584978b CCastle/2.Analyse/5b.short_FSM-vs-Grammar.rst
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/CCastle/2.Analyse/5b.short_FSM-vs-Grammar.rst Thu Aug 11 16:10:02 2022 +0200
@@ -0,0 +1,24 @@
1+FSMs vs Grammars
2+=================
3+
4+In case you are wondering: *“Are FSMs and Grammars related, or the same?”*
5+
6+They are related, but not equal.
7+
8+As discovered by `Chomsky <https://en.wikipedia.org/wiki/Noam_Chomsky>`__ there is a `hierarchy <https://en.wikipedia.org/wiki/Chomsky_hierarchy>`__ of languages (or actually grammars). As Chomsky was studying all kinds of languages, including natural “human” language, his perception differs from our interpretation. |BR| When we (SW/Engineers) speak about grammars, we typically refer to “Context free grammars” (`CFG <https://en.wikipedia.org/wiki/Context-free_grammar>`__) -- Chomsky classifies them as “Type-2”.
9+
10+A non-deterministic PushDown machine (`PDA <https://en.wikipedia.org/wiki/Pushdown_automaton>`__) is needed to
11+recognise a “Type-2” (CFG) grammar. (A PDA is a simple version of a stack-machine, which is a lot more restricted than a Turing Machine.)
12+|BR|
13+And as an FSM is one of the most restricted machines that exist --it can recognise only a “Type-3” `Regular grammar
14+<https://en.wikipedia.org/wiki/Regular_grammar>`__-- an FSM can’t be used to recognise a (CFG) grammar.
15+
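
To make the difference tangible, here is a small sketch of both machine classes (in Go, as a stand-in language; the function names are ours, for illustration): a three-state FSM that recognises the regular, “Type-3” language ``a+b+``, and a counter-based recogniser --the simplest possible stack-- for the context-free, “Type-2” language of balanced parentheses, which no FSM can handle.

.. code-block:: go

   package main

   import "fmt"

   // acceptAB is a three-state FSM for the regular (Type-3) language a+b+.
   // Its only memory is `state`: a fixed, finite amount.
   func acceptAB(s string) bool {
       state := 0 // 0: start, 1: reading a's, 2: reading b's
       for _, c := range s {
           switch {
           case state == 0 && c == 'a':
               state = 1
           case state == 1 && c == 'a':
               // stay: more a's
           case (state == 1 || state == 2) && c == 'b':
               state = 2
           default:
               return false // no transition: reject
           }
       }
       return state == 2
   }

   // acceptBalanced recognises balanced parentheses, a context-free (Type-2)
   // language. It needs unbounded memory (here a counter, the degenerate
   // one-symbol stack of a PDA) -- exactly what a finite-state machine lacks.
   func acceptBalanced(s string) bool {
       depth := 0
       for _, c := range s {
           switch c {
           case '(':
               depth++
           case ')':
               depth--
               if depth < 0 {
                   return false // closing before opening
               }
           default:
               return false
           }
       }
       return depth == 0
   }

   func main() {
       fmt.Println(acceptAB("aabb"), acceptAB("ba"))               // true false
       fmt.Println(acceptBalanced("(())()"), acceptBalanced(")(")) // true false
   }
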
16+.. seealso::
17+
18+ .. postlist::
19+ :category: Castle
20+ :tags: FSM
21+
22+ .. postlist::
23+ :category: Castle
24+ :tags: Grammar
diff -r 127f37d96bce -r f02f0584978b CCastle/2.Analyse/8.ConcurrentComputingConcepts.rst
--- a/CCastle/2.Analyse/8.ConcurrentComputingConcepts.rst Sat Jul 23 20:45:18 2022 +0200
+++ b/CCastle/2.Analyse/8.ConcurrentComputingConcepts.rst Thu Aug 11 16:10:02 2022 +0200
@@ -15,7 +15,7 @@
1515 :need:`U_ManyCore`].
1616 |BR|
1717 We also discussed threads_: they do not scale well for CPU-bound (embedded) systems. And I introduced some
18- contemporary abstractions; which do not always fit nicely in existing languages.
18+ (other) concurrency abstractions, which do not always fit nicely in existing languages.
1919
2020 As Castle is a new language we have the opportunity to select such a concept and incorporate it into the language ...
2121 |BR|
@@ -56,8 +56,8 @@
5656 sequential execution is allowed too.
5757
5858
59-Parallel
60---------
59+Parallelism
60+-----------
6161
6262 Parallelism_ is about executing multiple tasks (seemingly) at the same time. We will focus on running multiple concurrent
6363 tasks (of the same program) on “as many cores as possible”. When we assume a thousand cores, we need a thousand
@@ -79,25 +79,35 @@
7979 analogy [#DistributedDiff]_, and we should use the know-how that is available, to design our “best ever language”.
8080
8181
82-Communication
83-=============
82+Efficient Communication
83+=======================
8484
8585 When multiple tasks run concurrently, they have to communicate to pass data and to control progress. Unlike in a
86-sequential program -- where the controll is trivial, as sharing data-- this needs a bit of extra effort.
86+sequential program -- where the control is trivial, as is sharing data -- this needs a bit of extra effort.
8787 |BR|
88-There are two main approches: shared-data of message-passing.
88+There are two main approaches: shared-data or message-passing; we will introduce them below.
89+
90+Communication takes time, especially *wall-time* [#wall-time]_ (or real time), and may slow down computing. Therefore
91+communication has to be efficient. This is a hard problem, and it becomes harder when we have more communication, more
92+concurrent (& parallel) tasks, and/or shorter-living tasks. Or better: it depends on the ratio of
93+communication-time to time-between-two-communications. For example, a lock costing a hundred nanoseconds is negligible around a millisecond of computation, but pure overhead around a single addition.
94+
8995
9096 Shared Memory
9197 -------------
9298
9399 In this model all tasks (usually threads or processes) have some shared/common memory; typically “variables”. As the access
94100 is asynchronous, the risk exists that the data is updated “at the same time” by two or more tasks. This can lead to invalid
95-data; and so Critical-Sections_ are needed.
101+data and so Critical-Sections_ are needed.
96102
97103 This is a very basic model which assumes that there is physical memory that can be shared. In distributed systems this
98104 is uncommon; but for threads it’s straightforward. A disadvantage of this model is that it is hazardous: Even when a
99105 single access to such a shared variable is not protected by a Critical-Section_, the whole system can break [#OOCS]_.
100106
107+The advantage of shared memory is the short communication-time. The wall-time and CPU-time are roughly the same: the
108+time to write & read the variable, added to the (overhead) time for the critical section -- which is typically the
109+bigger part.
110+
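
As a minimal sketch of the shared-memory model (in Go, our illustration language throughout; not Castle): two tasks increment one shared variable, and each increment is wrapped in a tiny Critical-Section. Remove the lock, and the result becomes unpredictable.

.. code-block:: go

   package main

   import (
       "fmt"
       "sync"
   )

   func main() {
       var (
           counter int        // the shared/common memory: one "variable"
           mu      sync.Mutex // guards the critical section around counter
           wg      sync.WaitGroup
       )

       for t := 0; t < 2; t++ { // two concurrent tasks
           wg.Add(1)
           go func() {
               defer wg.Done()
               for i := 0; i < 1000; i++ {
                   mu.Lock()   // enter the critical section (may suspend us)
                   counter++   // the protected read-modify-write
                   mu.Unlock() // leave; a waiting task may now enter
               }
           }()
       }
       wg.Wait()
       fmt.Println(counter) // always 2000; without the mutex: unpredictable
   }
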
101111 About Critical Sections
102112 ~~~~~~~~~~~~~~~~~~~~~~~
103113 For those who are not familiar with Critical-Sections_ and/or Semaphores, a very short intro.
@@ -107,7 +117,7 @@
107117 that happens when writing or reading a variable and another thread also accesses the same shared-memory, the result is
108118 unpredictable. To prevent that, we need to control the handling of that variable: make it a Critical-Section_.
109119
110-In essense, we have to tell the “computer” to make that line (of a few lines) atomic; make it exclusive for a task. Then
120+In essence, we have to tell the “computer” that a line (or a few lines) is *atomic*; to make access exclusive. Then
111121 the compiler will add some extra fundamental instructions (specific for that CPU-type) to assure this. A check is
112122 inserted just before the section is entered, and the thread will be suspended when another task is using it. When
113123 granted access, a bit of bookkeeping is done -- so the “check” in other threads will halt them. That bookkeeping is updated
@@ -118,42 +128,70 @@
118128 |BR|
119129 There are many algorithms to solve this. All with the same disadvantage: it takes a bit of time -- possibly by
120130 “Spinlocking_” all other cores (for a few nanoseconds). As Critical-Sections are usually short (e.g. an assignment, or a
121-few lines) the overhead can be huge(!) [#timesCPU]_.
131+few lines) the overhead can be (relatively) huge [#timesCPU]_!
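
To make that “check plus bookkeeping” concrete, below is a toy spinlock (a Go sketch, purely illustrative; real code should use the platform’s mutex): the atomic compare-and-swap is the inserted check, the flag is the bookkeeping, and every failed attempt by another core is exactly the spinning overhead described above.

.. code-block:: go

   package main

   import (
       "runtime"
       "sync/atomic"
   )

   // SpinLock is a toy illustration only -- prefer sync.Mutex in real code.
   // The int32 flag is the "bookkeeping"; compare-and-swap is the atomic
   // "check" that the compiler/CPU provides as a fundamental instruction.
   type SpinLock struct{ flag int32 }

   func (l *SpinLock) Lock() {
       // Try to flip 0 -> 1 atomically; every failed attempt is pure
       // overhead: this core spins instead of doing useful work.
       for !atomic.CompareAndSwapInt32(&l.flag, 0, 1) {
           runtime.Gosched() // be polite: yield instead of burning the core
       }
   }

   func (l *SpinLock) Unlock() {
       atomic.StoreInt32(&l.flag, 0) // release: other tasks' checks now succeed
   }

   func main() {
       var lock SpinLock
       lock.Lock()
       // ... a short critical section would go here ...
       lock.Unlock()
   }
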
122132
123133
124134 Messages
125135 --------
126136
127137 A more modern approach is Message-Passing_: a task sends some information to another; this can be a message, some data,
128-or an event. In all cases, there is a distinct sender and receiver -- and apparently no common/shared memory--, so no
129-Critical-Sections [#MPCS]_ are needed; at least no explicitly. Messages can be used by all kind of task; even in a
138+or an event. In all cases, there is a distinct sender and receiver -- and apparently no common/shared memory -- so no
139+Critical-Sections [#MPCS]_ are needed; at least not explicitly. Messages can be used by all kinds of tasks; even in a
130140 distributed system -- then the message (and its data) is serialised, transmitted over a network and deserialised, which
131141 can introduce some overhead and delay.
132-|BR|
133-Many people use this networking mental model when they thing about Message-Passing_, and *wrongly* assume there is
134-always serialisation (or network) overhead. When (carefully) implemented this is not needed; and can be as efficiently
135-as shared-memory (assuming there is shared-memory that can be used).
136142
137-This “message
138-There is an analogy between Message-Passing_ and Events_ (and the event-loop). They have separate history, but are quite
139-similar in nature. Like a “message”, the “event” is also used to share data (& controll) to isolated “tasks”.
143+As you may have noticed, there is an analogy between Message-Passing_ and Events_ (and the event-loop). They have
144+separate histories, but are quite similar in nature. Like a “message”, an “event” is also used to share data (& control)
145+with isolated “tasks”.
140146
141-Variants
147+.. warning::
148+
149+ Many people use the networking mental model when they think about Message-Passing_, and *wrongly* assume there is
150+ always serialisation (or network) overhead. This is not needed for parallel cores, as they typically have shared
151+ memory.
152+
153+ Then, at developer-level, we can use the Messaging abstraction, where the compiler will translate it into shared
154+ memory instructions at processor-level.
155+
156+There are many variants of messaging, mostly combinations of a few fundamental aspects:
157+
158+(Un)Buffered
159+~~~~~~~~~~~~
160+Messages can be *buffered*, or not. It’s not a characteristic of the message itself, but of the channel that
161+transports the messages: how many messages can be stored ‘in’ the channel (often depicted as a queue). When there is no
162+storage at all, the writer and the reader need to rendezvous: send and receive at the same (wall) time.
163+|BR|
164+When a buffer is available, this is not needed. Depending on the size of the buffer, some messages can be sent before
165+they are picked up by the receiver. Note this is always asymmetric: messages need to be sent before they can be read.
166+
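
Go’s channels happen to map directly onto this distinction, so a short sketch (illustrative, not Castle): an unbuffered channel forces a rendezvous, while a buffered one lets the sender run ahead of the receiver -- but never the other way around.

.. code-block:: go

   package main

   import "fmt"

   func main() {
       // Unbuffered: capacity 0, so send and receive must rendezvous.
       rendezvous := make(chan string)
       go func() { rendezvous <- "hello" }() // blocks until main receives
       fmt.Println(<-rendezvous)

       // Buffered: up to 2 messages can be stored 'in' the channel, so the
       // sender may run ahead of the receiver.
       buffered := make(chan int, 2)
       buffered <- 1 // does not block: the buffer has room
       buffered <- 2 // still room
       fmt.Println(<-buffered, <-buffered) // 1 2: reads always trail writes
   }
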
167+Blocking
142168 ~~~~~~~~
169+Both the writer and the reader can be blocking (or not). A blocking reader will always return when a message is
170+available -- and will pause until then. A writer can also be blocking: it will pause until the message can be sent --
171+e.g. when the reader is available (rendezvous) or a message-buffer slot is free.
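
A short sketch of both flavours (again Go): a plain receive blocks -- it pauses the task until a message is available -- while a ``select`` with a ``default`` branch turns the same operation into a non-blocking attempt.

.. code-block:: go

   package main

   import "fmt"

   func main() {
       ch := make(chan int, 1) // buffered channel with one slot

       // Non-blocking receive: the default branch runs when no message is ready.
       select {
       case v := <-ch:
           fmt.Println("got", v)
       default:
           fmt.Println("no message yet") // the channel is still empty
       }

       ch <- 42 // a blocking send would pause here if the buffer were full

       v := <-ch // blocking receive: pauses until a message is available
       fmt.Println("got", v)
   }
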
143172
144-There are may variants of Message-Passing_
173+(A)Synchronous
174+~~~~~~~~~~~~~~
175+Synchronous messages resemble normal function-calls. Typically a “question” is sent, the call awaits the
176+answer-message, and that answer is returned.
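
One common way to build this synchronous, call-like style on top of plain messages (a Go sketch; the ``request`` and ``ask`` names are ours): the “question” carries its own reply-channel, and the caller blocks on that channel until the answer-message arrives.

.. code-block:: go

   package main

   import "fmt"

   // request is a "question" message that carries its own reply channel.
   type request struct {
       x     int
       reply chan int
   }

   func main() {
       requests := make(chan request)

       // The answering task: receive a question, send back the answer-message.
       go func() {
           for req := range requests {
               req.reply <- req.x * req.x
           }
       }()

       // For the caller, send-plus-await reads like a normal function-call.
       ask := func(x int) int {
           r := request{x: x, reply: make(chan int)}
           requests <- r    // send the question
           return <-r.reply // block until the answer-message arrives
       }

       fmt.Println(ask(6)) // 36
   }
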
177+
145178
146179
147180 ************************************************************
148181
149-.. todo:: All below is draft and needs work!!!!
182+.. todo::
150183
184+ * Async, buffered, blocking
185+ * Pipe : kind of data messages
186+
187+
188+ .. todo:: All below is draft and needs work!!!!
151189
152190
153191 Models
154192 ======
155193
156-Probably the oldest model to described concurrency is the Petri-net_; which has the disadvantage that is synchronous
194+Probably the oldest model to describe concurrency is the Petri-net; it is synchronous
157195 (all tokens move at the same timeslot) -- which is hard to implement (efficiently) on Multi-Core_.
158196
159197 Actors
@@ -183,6 +221,10 @@
183221 |BR|
184222 But that condition does apply to Multi-Core_ too, although the (timing) numbers do differ.
185223
224+.. [#wall-time]
225+ As a reminder: we speak about *CPU-time* when we count the cycles that a core is busy; so when a core is waiting, no
226+ CPU-time is used. And we use *wall-time* when we measure time according to “the clock on the wall”.
227+
186228 .. [#OOCS]
187229 The brittleness of Critical-Sections_ can be reduced by embedding (the) (shared-) variable in an OO abstraction. By
188230 using *getters* and *setters*, that control the access, the biggest risk is (mostly) gone. That does not, however,
@@ -214,4 +256,4 @@
214256 .. _Actor-Model: https://en.wikipedia.org/wiki/Actor_model
215257 .. _Actor-Model-Theory: https://en.wikipedia.org/wiki/Actor_model_theory
216258
217-.. _Petri-Net: https://en.wikipedia.org/wiki/Petri_net
259+
diff -r 127f37d96bce -r f02f0584978b CCastle/2.Analyse/9.ActorAbstraction.rst
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/CCastle/2.Analyse/9.ActorAbstraction.rst Thu Aug 11 16:10:02 2022 +0200
@@ -0,0 +1,33 @@
1+.. include:: /std/localtoc.irst
2+
3+.. _ActorAbstractions:
4+
5+================================
6+Actor Abstractions (Busy, DRAFT)
7+================================
8+
9+.. post::
10+ :category: Castle DesignStudy
11+ :tags: Castle, Concurrency
12+
13+ The Actor-Model_ is inspired by physics, unlike older models like the famous Petri-net_ that are mostly
14+ mathematical. It has become popular for Distributed-Computing_, where each node/computer acts as an independent,
15+ active actor.
16+
17+ As an abstraction, this is similar to the “Many Core” :ref:`CC` we use for CCastle. Hence, we study this model a bit more.
18+
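
As a minimal sketch of that abstraction (in Go; a goroutine with a mailbox-channel stands in for an actor here -- an illustration, not Castle’s design): the actor owns its state exclusively, and the outside world can reach it only by sending messages.

.. code-block:: go

   package main

   import "fmt"

   // counterActor owns `total` exclusively; no other task can touch it.
   // Its whole interface to the world is the mailbox it reads from.
   func counterActor(mailbox <-chan int, done chan<- int) {
       total := 0
       for msg := range mailbox {
           total += msg // react to one message at a time
       }
       done <- total // report the final state once the mailbox closes
   }

   func main() {
       mailbox := make(chan int, 8)
       done := make(chan int)
       go counterActor(mailbox, done) // spawn the actor

       for i := 1; i <= 4; i++ {
           mailbox <- i // all communication is via messages
       }
       close(mailbox)
       fmt.Println(<-done) // 10
   }
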
19+
20+
21+
22+
23+----------
24+
25+.. rubric:: Footnotes
26+
27+.. [#XXX]
28+ YYY
29+
30+.. _Petri-Net: https://en.wikipedia.org/wiki/Petri_net
31+.. _Actor-Model: https://en.wikipedia.org/wiki/Actor_model
32+.. _Actor-Model-Theory: https://en.wikipedia.org/wiki/Actor_model_theory
33+.. _Distributed-Computing: https://en.wikipedia.org/wiki/Distributed_computing
diff -r 127f37d96bce -r f02f0584978b CCastle/2.Analyse/short_FSM-vs-Grammar.rst
--- a/CCastle/2.Analyse/short_FSM-vs-Grammar.rst Sat Jul 23 20:45:18 2022 +0200
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,24 +0,0 @@
1-FSMs vs Grammars
2-=================
3-
4-In case you are wondering: *“Are FSMs and Grammars related, or the same?”*
5-
6-The are related, but not equal.
7-
8-As discovered by `Chomsky <https://en.wikipedia.org/wiki/Noam_Chomsky>`__ there is a `hierarchy <https://en.wikipedia.org/wiki/Chomsky_hierarchy>`__ of languages (or actually grammars). As Chomsky was studying all kind of languages, include natural “human” language, his perception differs from our interpretation. |BR| When we (SW/Engineers) speak about grammars, we typically refer to “Context free grammars” (`CFG <https://en.wikipedia.org/wiki/Context-free_grammar>`__) -- Chomsky classify them as “Type-2”.
9-
10-A non-deterministic PushDown machine (`PDA <https://en.wikipedia.org/wiki/Pushdown_automaton>`__) is needed to
11-recognise a “Type-2” (CFG) Grammar. (Which is a simple version of a stack-machine, which is lot more restricted then a Turing Machine).
12-|BR|
13-And as a FSM is one of the most restricted machines that exist --it can recognise only “Type-3” `Regular grammar
14-<https://en.wikipedia.org/wiki/Regular_grammar>`__-- a FSM can’t be used to recognise a (CFG) grammar.
15-
16-.. seealso::
17-
18- .. postlist::
19- :category: Castle
20- :tags: FSM
21-
22- .. postlist::
23- :category: Castle
24- :tags: Grammar